INFLUENCE OF PUPILS’ ATTITUDES ON PERCEPTION
OF TEACHERS’ BEHAVIORS AND ON
CONSEQUENT SCHOOL WORK¹


JANICE B. GOLDBERG 
University of Maryland, Baltimore 


In a study to determine whether attitudes toward authority and 
school work are associated with differential perception of teachers’ be-
haviors and school performance, 254 8th- and 9th-grade boys classified
as high or low on the California F Scale, Flexibility Scale, and Com-
pulsivity Scale rated teacher behavior and reported the amount of
school work performed. High compulsives perceived teachers as sig- 
nificantly less authoritarian than did low compulsives and did less 
work when the teacher was perceived as nonauthoritarian. 


Pupils who achieve well in one teacher's 
classroom may achieve poorly in another 
teacher’s class. Similarly, in any classroom
some pupils achieve well, while others,
equally intelligent, achieve poorly. Little
is known about differential pupil reaction 
to teachers and its consequent effect upon 
pupils’ school performance. 

Some investigations of pupil-teacher re- 
lationships have used pupils’ ratings of 
teachers’ classroom behaviors which are 
conceptualized along personality dimen- 
sions related to the authoritarian versus 
the nonauthoritarian personality pattern 
(Amidon & Flanders, 1961; Cogan, 1954; 
Flanders, 1951). The intent of these stud- 
ies has been to show that authoritarian- 
related teacher behaviors elicit pupil 
anxiety which results in lowered pupil 
achievement. The findings of these studies 
are inconclusive. The validity of using 
pupils’ ratings as a method for determin- 
ing differential pupil reaction to teachers 
stems from the fact that pupils observe 
more of the teacher’s typical behavior 
than is usually available to the outside 
observer. Moreover, pupils are directly in- 
volved in the teaching-learning process. 

¹ This study is based on a dissertation submitted
to the Graduate School of Education of Harvard
University in partial fulfillment of the require-
ments for the Doctor of Education degree. The
research was supported by a grant from the Mil-
ton Fund. The author wishes to thank her ad-
visors, D. W. Oliver and G. W. Goethals, for their
assistance in the execution of the study.

There are two major shortcomings in
most pupil rating studies. One is the pooling
of all pupils’ ratings without consideration
of individual differences in pupils’
perceptions despite the fact that extensive 
research has shown that individual per- 
sonality factors influence perception 
(Bruner, 1958). The other is the use of 
broad variables, for example, “liking the 
teacher.” Such global variables do little 
to clarify the complexity of teacher-pupil 
relationships and require considerable in- 
ference on the part of pupils. To avoid 
these shortcomings this study investigated 
the relationships between pupils differ- 
entiated in their attitudes toward author- 
ity and their perceptions of specific, de- 
notable teacher behaviors. These specific 
behaviors require less inference by pupils. 

The study rests on the premise that 
differential pupil reaction to teachers may 
be due to underlying attitudinal factors 
which influence pupils’ perceptions of 
teachers’ behaviors and their consequent 
performance of school work. It is also as- 
sumed that teachers’ attitudinal factors 
predispose them to behave in a particular 
fashion and that these behaviors can be 
identified. 

While many personality factors influence 
perception and behavior, this study relies 
upon the measurement of attitudinal vari- 
ables comprising the authoritarian versus 
the nonauthoritarian personality dimen- 
sion (Adorno, Frenkel-Brunswik, Levin- 
son, & Sanford, 1950) as a significant de- 
terminant of teacher behavior and as an 
influence on pupil perception. In view of 
findings that high and low authoritarians 
differ in their perceptions of others
(Scodel & Mussen, 1953), in their study 
habits (Gladstein, 1957), and in their 
ability to learn different kinds of subject 
matter (Neel, 1959), it is suggested that 
pupils who are differentiated in their at- 
titudes toward authority will differ in their 
perceptions of teachers’ behaviors related 
to the authoritarian versus the nonauthori- 
tarian personality dimension (Adorno et 
al., 1950). 

These hypotheses were tested: (a) 
Pupils differentiated as high or low on the 
California F Scale, on the Flexibility Scale, 
and on the Compulsivity Scale will differ 
in their perceptions of teachers’ classroom 
behaviors. (b) The ratings of teachers’ 
classroom behaviors by pupils differenti- 
ated as high or low on the three attitude 
scales are related to the amount of re- 
quired and self-initiated work pupils re- 
port they perform. 


METHOD


There are three important features in the re- 
search design. One is the use of specific, denotable 
classroom teacher behaviors which are operation- 
ally defined in terms of the variables underlying 
the California F Scale (Adorno et al., 1950) rather 
than the more common use of global variables to 
describe teachers’ behaviors. The characterization 
of teachers’ behaviors is based on a well-defined 
personality theory, that is, the authoritarian per- 
sonality dimension, since the variables comprising 
this dimension are expressed by teachers in many 
ways in the classroom. The second is the assess- 
ment of pupils’ attitudes on a personality dimen- 
sion which taps attitudes toward authority. This 
seems appropriate since the authority vested in 
the teacher is an important construct in the pupil’s 
daily school life. The third important feature is 
the use of two unique criterion variables devised 
by Cogan (1954)—the amount of required work 
and self-initiated work performed by pupils. While 
these criteria do not directly measure pupil change, 
for example, as measured by standardized achieve- 
ment tests, they avoid the pitfalls of these tests. 
Cogan argued that performance of pupil work is 
closely related to pupil change (or gain) and “in- 
tervenes just prior to pupil change.” 


Subjects

Subjects were 254 eighth- and ninth-grade boys
and their 12 male social studies teachers in
junior high schools in a Boston suburb.


Instruments


All measures (pupils’ attitudes, pupils’ ratings
of teacher behaviors, and pupils’ reports of re-
quired and self-initiated work) were secured from
one questionnaire. Administration procedures were
standardized, and pupils were given assurances of
anonymity.


Independent Variables 


To differentiate pupils according to their attitudes
toward authority, three instruments were
used: (a) McGee's (1955) revised version of the
[...] intolerant of ambiguity, to be rigid in their thinking, and
to show perceptual distortion when rating others.
Split-half reliability is .86. (b) Gough’s (1956)
Flexibility Scale was also modified for this study.
This scale measures desire for order and certainty,
especially in intellectual matters. High scorers
resist learning ambiguous material found in social
studies content, a resistance associated with the
authoritarian’s intolerance of ambiguity. The scale
correlates (r = .59, p < .01) with the F Scale
described above. Inter-item reliability is .75. (c)
Berlak’s (1959) Compulsivity Scale was shortened
for this study. This scale measures attention to
detail and order, particularly in school work. High
scorers (high compulsives) may be said to have a
strong desire to do well in school while the converse
is true for low compulsives. These assumptions
regarding the differences in the attitudes of
high and low compulsive pupils seem tenable in
view of Oliver and Shaver’s (1962) finding of a
strong relationship (r = .57, p < .005) between
Compulsivity and “need cognition,” defined as “a
desire to do well in school.” The overall rigidity
of behavior measured by the scale reflects attitudes
consistent with authoritarianism. The scale
correlates (r = .20, p < .05) with the F Scale.
Inter-item reliability is .76. A Likert-type scale is
provided for each of the three attitude measures
with six possible responses to each item. The
responses range from “strong agreement” to
“strong disagreement.”
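
As an illustration of how a six-response Likert scale of this kind could be scored and pupils classified as high or low, the following Python sketch may help. The 1-to-6 coding, the function names, the sample responses, and the median split are all assumptions made for the example; the article does not state the scoring key or the cut point actually used.

```python
from statistics import median

# Assumed coding: 6 = "strong agreement" ... 1 = "strong disagreement".
def scale_score(responses):
    """Sum one pupil's item responses (each coded 1-6) into a scale score."""
    if any(r not in range(1, 7) for r in responses):
        raise ValueError("each response must be an integer from 1 to 6")
    return sum(responses)

def split_high_low(scores):
    """Label each score 'high' or 'low' relative to the sample median.

    A median split is an assumption; scores equal to the median are
    labelled 'low' here.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical responses from four pupils on a three-item scale.
pupils = [[6, 5, 6], [2, 1, 3], [4, 4, 5], [1, 2, 2]]
scores = [scale_score(p) for p in pupils]
labels = split_high_low(scores)
print(list(zip(scores, labels)))
```

Any other cut point (for example, upper and lower thirds) could be substituted in `split_high_low` without changing the scoring step.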


Dependent Variables 


Descriptive teacher behaviors were related, by
hypothesis, to variables representing the underlying
personality trends measured by the California
F Scale. In turn, these descriptive behaviors,
hypothesized to be manifestations of the authoritarian
versus the nonauthoritarian personality
pattern, were matched with specific, denotable
teacher behavior items drawn from Cogan’s (1954)
Pupil Survey. The criterion guiding the process of
determining which denotable behavior item corresponded
to the descriptive behavior was the functional
relevance of the item to the behavior. The
process of matching descriptive behaviors and
thereby relating F-Scale variables to denotable
teacher behaviors is demonstrated as follows. One
F Scale variable comprising the authoritarian personality
structure is termed “Anti-intraception.”
This personality trend is characterized by impatience
with the subjective and the tender-minded.
It was hypothesized that such a teacher would
tend to be unconcerned with what pupils think and
feel, and might be contemptuous of the academically
poor pupil. The descriptive behavioral manifestation
of this variable is the statement, “Teacher
is unsympathetic with a pupil’s failure at a task.”
The operational definition of this behavior and,
therefore, of the F-Scale variable is the questionnaire
item: “When we give a wrong answer in
class, our teacher says we are ‘slow,’ ‘not smart,’
etc.”

The 25 specific authoritarian teacher behavior 
items characterize the authoritarian teacher, in 
part, as strongly directive, impatient with aca- 
demically inferior pupils, and generally rejecting 
of pupils. High scores on these items represent 
the extent to which pupils perceive teachers as 
authoritarian. 

A similar procedure was used to relate descrip- 
tive nonauthoritarian teacher behaviors to specific 
Pupil Survey items. The 23 specific nonauthori- 
tarian teacher behaviors represent personality 
trends hypothesized to be opposite to the meaning 
of the F-Scale variables.* These items characterize 
the nonauthoritarian teacher, in part, as permis- 
sive, more concerned with individual pupils’ needs, 
and generally accepting of pupils. High scores on 
these items represent the extent to which pupils 
perceive teachers as nonauthoritarian. 

A 5-point frequency scale was provided for each 
of the items. Responses to the items range from 
“Almost never” to “Very often.” Cogan (1954) re- 
ports a reliability coefficient of .96 for these teacher 
behavior items. Pupils’ ratings of these items are 
the dependent variables of the study. 


Criterion Variables 


To determine the amount of school work per- 
formed by pupils, two measures were used: (a) 
24 items concerned with homework represent the 
amount of required work a pupil does, and (b) 21 
items represent the extent to which a pupil does 
extra, unassigned (self-initiated) school work.
Some examples are: 

Required work: “Give a report” 
Self-initiated work: “I make extra graphs,
charts, etc.”

The items were rated on a 6-point frequency 
scale. Responses to the items range from “Almost 
never” to “Almost always.” The reliability coeffi-
cient for the required work items is .94 and .89
for the self-initiated work items (Cogan, 1954).


RESULTS AND DISCUSSION


Hypothesis 1 is partially confirmed. 
Table 1 shows that compulsivity in the 
total sample is strongly related to pupils’ 
perceptions of teachers’ behaviors. Pupils’ 
F-Scale and Flexibility Scale scores are
not related to their perceptions of teachers’
behaviors.

* Complete tables of descriptive teacher be-
haviors and specific items may be found in the
author’s unpublished doctoral dissertation (Gold-
berg, 1965).


TABLE 1
Correlations Between Pupils’ Scores on the
Attitude Scales and Their Perceptions of
Teachers’ Behaviors and the Amount
of Required and Self-Initiated
Work Performed

                                    Pupils’ scores
                        F Scale   Flexibility Scale   Compulsivity Scale
Teachers’ behaviors
  Authoritarian          −0.07         −0.08               −0.23*
  Nonauthoritarian        0.04          0.08                0.23*
Work performed
  Required                0.02          0.08                0.35**
  Self-initiated          0.01          0.02                0.43**

Note.—N = 254.
* p < .01.
** p < .005.

While compulsivity is a component of 
the F Scale (r = .20, p < .05) and of 
the Flexibility Scale (r = .27, p < .005), 
its influence on pupil perception of teacher 
behavior may be due to the fact that it 
measures school-related attitudes rather 
than the generalized attitudes measured by 
the F Scale and the Flexibility Scale. 
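
To make the contrast between the significant and nonsignificant coefficients in Table 1 concrete, the sketch below converts a correlation into a t statistic using the standard formula t = r·sqrt(N − 2)/sqrt(1 − r²). The two-tailed test and the approximate critical value of 2.60 for df = 252 are assumptions drawn from standard tables; the article does not say which test was applied.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

N = 254  # sample size reported in the note to Table 1
# Approximate two-tailed .01 critical value of t for df = 252 (an
# assumption taken from standard tables, not from the article).
T_CRIT_01 = 2.60

for label, r in [("Compulsivity vs. authoritarian ratings", -0.23),
                 ("F Scale vs. authoritarian ratings", -0.07)]:
    t = t_from_r(r, N)
    significant = abs(t) > T_CRIT_01
    print(f"{label}: r = {r:+.2f}, t = {t:+.2f}, significant = {significant}")
```

With N = 254, a coefficient of .23 in absolute value clears the assumed threshold comfortably, while coefficients below .10 do not.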

The t tests in Table 2 show that when
pupils are differentiated as high or low 
on the Compulsivity Scale their ratings of 
teachers’ behaviors are significantly dif- 
ferent. High compulsives, those who work 
carefully in order to do well in school, 
perceive teachers as more nonauthori- 
tarian. Low compulsives, those who may 
be less concerned with doing well in school, 
perceive teachers as more authoritarian. 


TABLE 2

Means of High and Low Compulsive Pupils’
Ratings of Teachers’ Behaviors

Mean pupil ratings


The differences in perception by the
two groups of pupils may lie in differ- 
ential treatment of these pupils. The 
authoritarian teacher is characterized, in 
part, as strongly directive, impatient with 
academically poor pupils, and may be un- 
concerned with pupils’ personal needs and 
goals. The authoritarian teacher tends to 
demand good school performance, to insist 
on strict order, and to conform to a rigid 
routine. 

Since the Compulsivity Scale measures 
attention to detail and order in school 
work, it has been assumed that the high 
compulsive wants to do well in school. He 
is probably the academically strong pupil 
who conforms to teacher expectations of 
good work. It may be that less demand is 
made of him by his teacher since he does 
good work in accordance with his own in- 
ner needs. He may be favorably treated 
in that he gets less criticism from his 
teacher. Thus, he perceives the teacher as
more nonauthoritarian. The low compul-
sive may care less about good school per-
formance. He is probably the academi-
cally weak pupil who does not meet teacher
expectations of good work. The teacher
may excessively criticize him and may
persistently require him to do more care-
ful work. Thus, the low-compulsive pupil
perceives the teacher as more authoritar-
ian. It is also possible that pupils’ at-
titudes toward the teacher as an authority
figure may result in perceptual distortion
of the teacher’s behavior. The high com-
pulsive, with his need to do well in school,
may perceive teacher demands for good
work as aiding him to do well in school
and, therefore, as reasonable behavior.
Thus, he rates the teacher as less authori-
tarian. The low compulsive, having less
need to do well in school, may perceive
this kind of teacher behavior as unreason-
able and he rates the teacher as more
authoritarian.

Hypothesis 2 is partially confirmed. 
Table 1 also shows that compulsivity in 
the total sample is positively related to 
the amount of work pupils perform. How- 
ever, no significant relationship exists be- 
tween either F-Scale or Flexibility Scale 
scores and amount of school work per-
formed.

Two-way analyses of variance show dif- 
ferences in pupils’ compulsivity and dif- 
ferences in pupils’ perceptions of teachers’ 
nonauthoritarian behaviors do influence 
their performance of required and self- 
initiated work. There is a highly significant 
interaction effect between pupils’ compul- 
sivity and pupils’ perceptions of nonau- 
thoritarian teacher behaviors with self- 
initiated work as the criterion measure 
(F = 19.17, df = 1/218, p < .01). The re-
quired F ratio at this level is 6.76. Inter-
action between pupils’ compulsivity and
their perceptions of nonauthoritarian
teacher behaviors with required work as
the criterion measure is significant at the
.05 level.

The cell means for these analyses show 
that high-compulsive pupils do less work 
when the teacher is perceived as nonau- 
thoritarian. Low-compulsive pupils do 
more work when the teacher is perceived 
as nonauthoritarian. Although the data de- 
rived for Hypothesis 1 indicate that high- 
compulsive pupils perceive teachers as 
more nonauthoritarian, these same cell 
means show that this perception of the 
teacher influences their performance of 
school work. These findings suggest that
perception of teacher behavior as non- 
authoritarian may conflict with high- 
compulsive pupils’ need for a directive, de- 
manding teacher. Thus, perception of non- 
authoritarian behavior appears to serve as 
cues for anxiety in high-compulsive pupils, 
resulting in less performance of school 
work. Conversely, perception of nonau- 
thoritarian behavior seems to encourage 
low-compulsive pupils to do more work, 
probably because this kind of teacher 
tends to have warmer relationships with 
pupils—even those who have less desire to 
do well in school. 
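
The crossover pattern described above can be pictured as a 2 × 2 table of cell means. The sketch below uses made-up scores (the article's cell means are not reproduced here) to show how a difference-of-differences contrast captures such an interaction; all numbers and names are hypothetical.

```python
from statistics import mean

# Hypothetical (made-up) self-initiated work scores, keyed by
# (compulsivity group, perception of teacher as nonauthoritarian).
cells = {
    ("high", "high"): [40, 44, 42],
    ("high", "low"):  [55, 57, 53],
    ("low", "high"):  [38, 36, 40],
    ("low", "low"):   [25, 27, 23],
}

cell_means = {key: mean(values) for key, values in cells.items()}

def interaction_contrast(m):
    """Difference of differences across the 2 x 2 table of cell means.

    A value far from zero means the effect of perceived nonauthoritarian
    behavior differs between high and low compulsives.
    """
    high_effect = m[("high", "high")] - m[("high", "low")]
    low_effect = m[("low", "high")] - m[("low", "low")]
    return high_effect - low_effect

contrast = interaction_contrast(cell_means)
print(cell_means, contrast)
```

In these invented data, perceiving the teacher as nonauthoritarian lowers work for high compulsives and raises it for low compulsives, so the two effects have opposite signs and the contrast is large in magnitude.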

Two-way analyses of variance of pupils’ 
compulsivity and their perceptions of 
teachers’ authoritarian behaviors show no
significant interaction effects. However,
trends in the cell means reveal that high-
compulsive pupils do more work when
they perceive the teacher as authoritarian
than do low-compulsive pupils. This lends 
weight to suggestions made earlier that 
high-compulsive pupils perceive this kind 
of teacher behavior as enabling them to 
fulfill their need to do well in school. Low- 
compulsive pupils do less work when the 
teacher is perceived as authoritarian. Thus, 
it is possible that such teacher behaviors 
may serve as cues for anxiety resulting in 
lowered school performance for those pupils 
who are less concerned about good school
work. 

The findings tend to support the con- 
clusion that pupils differentiated in their 
attitudes—in this case on a measure of 
compulsivity—do perceive different kinds 
of teachers’ behaviors differently and that 
this differential in perception influences 
the consequent amount of school work
performed.

In view of contemporary concern for
teaching “disadvantaged” children who
tend to have little interest in good school 
performance, these results may be helpful 
in selecting teachers for these children as 
well as for studying their learning patterns. 


The experiment described tested the effectiveness of the time-com-
pressed speech technique (tape recordings presented at rates exceeding
recording rates, but with normal pitch) in presenting material under
conditions of massed practice in listening to time-compressed speech.
A small group of Ss received practice material for about 7 hr. a day for
5 consecutive days at rates about 2½ times normal speaking rates (at
425 words per minute). Benchmark passages and tests were presented
daily. Results showed that comprehension increased from a mean of
about 40% of normal speed comprehension on Day 1 to a mean of
70% comprehension on Day 5. While effective, the massed-practice
procedure produced no better performance in a total of 35 hours of
practice than previous experiments using spaced practice of 1-2 hr. per
day produced in a total of 12-15 hr. of practice.


For some years there has been a growing 
interest in the extent to which the human 
being can comprehend auditory material 
presented at rates more rapid than normal. 
Early work in this area was frustrated by 
the fact that speeding up recorded material 
produced a frequency or pitch shift which, 
in itself, interfered with comprehension as 
much as did the rapidity of the presen-
tation. However, beginning about 15 years
ago, interest began to develop in a process 
of time-compression which would preserve 
the normality of pitch and yet, at the 
same time, permit the speed-up of the 
presentation. The first work along these 
lines was done by Garvey (1953), who 
carefully sliced out systematic deletions 
from a piece of recording tape, splicing 
the remaining pieces together to form a 
continuous record. Garvey’s experimental 
results showed that subjects (Ss) could 
understand his tape almost as well as they 
could one presented in a normal fashion. 
Shortly thereafter Fairbanks, Everett, and 
Jaeger (1954) developed an electro-mechan- 
ical device at the University of Illinois, 
which, in effect, performed the same thing 
automatically that Garvey had done with a 
razor blade. Again, the results were the
same. Fairbanks, Guttman, and Miron 
(1957) reported that Ss were able to com- 
prehend this material at rates ranging close 
to 300 words per minute with essentially 
no loss in comprehension. Somewhat later, 
Bixler, Foulke, Amster, and Nolan (1961) 
reported an attempt to apply the time-com- 
pression principle to the problem of provid- 
ing material at a more rapid rate for blind 
students. Foulke and his group used a Ger- 
man machine called the Tempo Regulator 
which operated on a principle essentially 
similar to that used by Fairbanks. Again, 
results were favorable in that his Ss re- 
ported being able to understand material 
at somewhat greater than normal presenta-
tion rates without much loss in compre-
hension.
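
The deletion-and-splice idea behind Garvey's manual method and the later automatic devices can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the "recording" is a list of numbered samples, and the function name and the `keep` and `discard` parameters are hypothetical, not taken from any of the cited studies.

```python
def compress_by_deletion(samples, keep, discard):
    """Keep `keep` samples, drop the next `discard`, and repeat.

    The retained pieces are abutted in order, so playback is shorter
    but each retained piece is played at its original rate (and pitch).
    """
    if keep <= 0 or discard < 0:
        raise ValueError("keep must be positive and discard non-negative")
    period = keep + discard
    out = []
    for start in range(0, len(samples), period):
        out.extend(samples[start:start + keep])
    return out

# Toy "recording": ten numbered samples instead of real audio.
original = list(range(10))
compressed = compress_by_deletion(original, keep=3, discard=2)
speedup = len(original) / len(compressed)
print(compressed, round(speedup, 2))
```

Because nothing is resampled, the pitch of each retained segment is unchanged; only the total duration shrinks, which is the property that distinguishes time compression from simply running the tape faster.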

Beginning in 1963 the senior author be- 
gan to study the applicability of time- 
compression as a more general educational 
technique. Two basic questions were posed: 
(a) Can college-level students understand 
college-level material without loss of com- 
prehension, when presented at modest rates 
of compression; and (b) as higher rates 
are introduced will the corresponding loss 
of comprehension be amenable to simple 
training (practice) routines? Initial ex- 
perimentation by the present authors over 
a period of a year and a half provided an
affirmative answer to both of these ques-
tions (Orr, Friedman, & Williams, 1965).


COMPREHENSION OF TIME-COMPRESSED SPEECH


PROBLEM 


Since that time the present authors have 
been engaged in experimental work de- 
signed to explore some of the other interest- 
ing questions which can be posed about 
the comprehension of time-compressed 
speech. Our initial experimentation had 
employed both graduated levels of practice 
ranging from approximately 325 words 
per minute up to 475 words per minute, 
and concentrated practice where all prac- 
tice material was presented at approxi- 
mately 425 words per minute. The results 
of an experiment conducted recently to 
determine the effectiveness of massive 
practice in learning to comprehend time- 
compressed speech are reported here. 

In the case of both graduated and con- 
centrated practice groups, practice ses- 
sions were confined to about an hour to 
an hour and a half per day and the entire 
period of experimentation was spread out 
over a period of 4-5 weeks. The purpose 
of the presently reported experimentation 
was to determine whether or not a con- 
centration of massive practice in terms of 
a large number of clock hours per day 
could produce essentially comparable re- 
sults in a few days. The rationale for the 
problem is two-fold. In the first place, if 
it should, as seems likely, become feasible 
to apply time-compressed speech as a 
general educational technique, it might be 
necessary to have naive students spend 
some amount of time practicing the com- 
prehension of time-compressed speech as 
a precursor to their regular studies. If this 
were the case, it would be desirable to have 
such a training course occupy a minimum 
number of days at the beginning of the 
term. Secondly, the experience of the 
Armed Services in recent years in attempt- 
ing to teach a second language has shown 
a fair amount of success for intensive or 
immersion exposure to the target language. 


PROCEDURE 


The “immersion” study Ss consisted of seven 
male students, between ages 19 and 20, at the 
freshman or sophomore college level. English was 
their native language and none had a marked 
regional accent. The average letter grade for all
students in their last semester in college was a
C+. Two of the Ss had some training in rapid 
reading but in neither case was the course com- 
pleted. None of the Ss had had any form of train- 
ing in listening. All of the Ss were screened audio- 
metrically for normal hearing. 

In the first session, Ss were given a brief talk
explaining that the purpose of this study was to 
provide intensive exposure to speeded speech, and 
to measure listening performance with periodic 
benchmark tests. They were also given a bio-
graphical data sheet to fill out which called for
basic information about their backgrounds. They
were then given alternate forms of the Nelson- 
Denny reading test, which measures reading com- 
prehension, rate, and vocabulary. This was fol- 
lowed by the presentation of a historical passage 
(taken from the same book as the later bench- 
mark passages) which was presented at normal re- 
cording speed (175 wpm). A standard multiple- 
choice test on the information contained in the 
passage was then given. A similar passage and 
test was then presented at 475 wpm as an initial 
measure of high-speed performance. The Ss were 
then asked to return for five consecutive week-
days, beginning on a Monday, from 9:00 A.M. to
9:00 P.M.

During the next week, 12 novels were played 
at 425 wpm as practice materials for the Ss. The
experiment was conducted in a semi-soundproofed 
room and materials previously compressed on the 
Tempo Regulator were played back on a Magne- 
cord tape recorder through a Bogen amplifier and 
two Electro-Voice speakers.

On each day listening material was presented 
for approximately 48 minutes without interruption.
At the end of that time a brief written quiz, in- 
cluding both short-answer and essay questions, 
was administered to Ss during a 10-minute period. 
This was followed by a 5-minute rest period, after 
which the cycle was repeated. The Ss were given 
1 hour for lunch during the afternoon, and 1 hour 
for dinner in the evening. During the latter part 
of each of the 5 days of exposure, a new bench- 
mark passage and test, similar to the preexperi- 
mental material, was administered. Each test was 
presented at 425 wpm. Near the end of the fifth 
day, the initial high-speed benchmark passage was 
presented again at 475 wpm. The time schedule 
for the study is shown in Table 1.
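
As a rough check on how the daily cycle adds up, the arithmetic can be sketched as follows. The sketch assumes every cycle has exactly the nominal lengths given in the text (48, 10, and 5 minutes) and ignores the benchmark passages and tests, so it is only an approximation; the actual session lengths varied.

```python
# Figures taken from the procedure: a 9:00 A.M. to 9:00 P.M. day, two
# 1-hour meal breaks, and a repeating cycle of listening, quiz, and rest.
SESSION_MINUTES = 12 * 60
MEAL_MINUTES = 2 * 60
LISTEN, QUIZ, REST = 48, 10, 5

available = SESSION_MINUTES - MEAL_MINUTES      # minutes left for cycles
cycle = LISTEN + QUIZ + REST                    # one full cycle
full_cycles = available // cycle                # whole cycles that fit
listening_hours_per_day = full_cycles * LISTEN / 60
listening_hours_total = 5 * listening_hours_per_day

print(full_cycles, listening_hours_per_day, listening_hours_total)
```

Under these assumptions about nine cycles fit in a day, giving roughly 7 hours of listening per day and about 36 hours over the 5 days, which is in line with the figures of about 7 hr. a day and 35 hours quoted for the study.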

At the end of testing on Wednesday and again
in a postexperimental session the following Thurs- 
day, Ss were asked to rate the novels they had 
listened to, on a 5-point scale, covering the fol- 
lowing aspects of the presentations: Overall ability 
to comprehend, intelligibility (clarity of indi-
vidual words), difficulty of subject matter, in-
terest in the book, quality of speaker's voice, and
quality of speaker's diction. An alternate form of 
the Nelson-Denny reading test was presented as a 
postexperimental measure of change. The Ss were 
then given extensive debriefing questionnaires to 
complete calling for subjective comments on the 


ᵃ Repeat of preexperiment measure.

TABLE 1
Practice Materials and Schedule of Presentation
(all at 425 wpm)

Material   Time (in minutes)   Material   Time (in minutes)
Day 1                          Day 3—Continued
De haan by the Dozen 45 Man ae ee Kumaon 19 
ui 10 reak (dinner) 60 
ri bi os C4 ae 0 
SE ie 9 Man Bates of Kumaon 3% 
the D joie 
Mae "gu Sait 10 Break 10 
Quis 10 Man Eaters of Kumaon 48 
Break (lunch) 60 aiz 5 
Diary of nett Girl ou Deed juestionnaire 15 
; Break as 10 ‘How to Wee Friends and Influence People 5 
Diary of a Young Girl fies a 
real 15 How to Win Friends and Influence People 47 
Diary of a Young Girl 43 Quis 6 
iz 5 Break 18 
reak crear! 60 How to Win Friends and Influence People 33 
Benchmark passage C-4 RB reak (lun 64 
Test C-4 13 How to Win Friends and Influence People 16 
Diary of a Young Girl 48 juis 3 
A How to Win Friends and Influence People al 
I Owe Russia $1800 cc) How to Win Friends and Influence People 16 
Quis 8 ui 4 
Day 2 iS Run Silent, Run Deep 24 
T Owe Russia $1200 60 Break iL 
real 10 Run Silent, Run Deep 64 
T Owe Russia $1200 25 Bi 10 
wis z Run Silent, Run Deep 89 
uls 
The “Miracle” New York Yankees - i be ae) 7 
Sia Ebon aseth a coat Run D 30 
func! 
The “Miracle” New York Yankees 8 tis Sree ae 15 
jilent, D 
The “Miracle” Now York Yankees 2 ae en ean De? 5 
juiz, 4 Run Silent, Run Deep WT 
reak 20 pai 15 
To Kill a Mockingbird 87 Day 5 
iz 2 Run Silent, Run Deep 45 
To Kill a Mockingbird 54 Run Silent, Run Deep 60 
ais, 6 Quis 10 
. Break 10 Break 18 
ic aM ce reper 54 America’s Race for the Moon 31 
ais a =) é “Americar Guneh 65 
Benatimask pemags C4 Hy ore tounge u 
To Kill. a Mockingbird rm Sor ak eg Rr a 
oe, - America’s Race for the Moon 8 
To Kila Reward 45 Riders ofthe B Purple Sage 20 
15 
Day 3 Rider: 
The Bavitement of Science 8 woth Pe pieacie, S00", By 
The Excitement of Science 4 (ok ofthe Purple fone ae 
tits 
The Bactement of Science 0 Geaeceslune wns soe % 
The Hecitement of Science ii pret (cine) (4 
peak (onch) ha ee Test C-2 “ 10 
The Foran: Pioneer 65 mark Deaege C-3! Ae 
5 Riders of the Purple Sage 20 
Man Ee rea 1% Break 10 
fan Eaters of Kumaon 47 Riders of the Purple Sage 61 
Peak: a Quiz 5 
Man Haters of Kumaon 52 
Quiz 7 
Break 5 
TABLE 2
Benchmark Test Scores Corrected for Chance and Percentages of Normal Speed Scores

                 Initial measures          Practice at 425 wpm            Final measures
Subject          Normal    High     Day 1   Day 2   Day 3   Day 4   Day 5   Repeated
                 speed     speed                                            high speed

1    Score       16.25     0.00     6.50    2.75    7.50    8.75    9.37     9.37
     %           100.0      0.0     40.0    16.9    46.2    53.8    57.7     57.7

2    Score       14.0      5.21     9.50    9.75    8.75    7.50    9.58    12.29
     %           100.0     37.2     67.9    69.6    62.5    53.6    68.4     87.8

3    Score       25.00     8.33    15.00    8.75   16.25   10.00   17.70    11.45
     %           100.0     33.3     60.0    35.0    65.0    40.0    70.8     45.8

4    Score       22.50     8.95    16.25   11.50   14.00   13.75   16.09    15.62
     %           100.0     39.8     72.2    51.1    62.2    61.1    71.3     69.4

5    Score       19.25     1.46     0.75    3.75    6.75   14.00    9.58     6.66
     %           100.0      7.6      3.9    19.5    35.1    72.7    49.8     34.6

6    Score       14.25     0.00     1.25    6.50    1.25    3.75   11.45     4.16
     %           100.0      0.0      8.8    45.6     8.8    26.3    80.4     29.2

7    Score       10.0      2.08     3.00    0.50    4.00    6.50    9.16     3.96
     %           100.0     20.8     30.0     5.0    40.0    65.0    91.6     39.6

Mean Score       17.32     3.72     7.46    6.21    8.36    9.18   11.84     9.07
     %           100.0     19.8     40.4    34.7    45.7    53.2    70.0     52.0


procedures, materials, and potential usefulness of 
compressed speech in the educational setting. 

Upon the completion of the experiment, each
S was paid $100.00 plus a bonus of $25.00 to the S
who demonstrated the greatest proficiency on the
benchmark tests.


RESULTS AND DISCUSSION


Results of the benchmark tests in terms 
of number of questions correct, based on 
25 item tests, corrected for chance, are 
shown in Table 2. Also shown is percentage 
of normal speed performance, calculated 
separately for each individual based on 
his own performance at normal (175 wpm) 
speed. It may be noted that there is a
progression of means from 40.4% on Day 
1 to 70.0% on Day 5, which is reasonably 
steady with the exception of a dropback 
on Day 2. In addition to this improvement, 
mean performance on the repeated high- 
speed passage (475 wpm) also improved 
from 19.8% to 52.0%, which is a signifi- 
cant improvement at the 1% level and 
significantly greater than that of a control 
group from previous experimentation. The 
progression of means is shown graphically 
in Figure 1. 
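
The percentage-of-normal figures can be re-derived from the raw scores in Table 2. The following is a minimal sketch, not the authors' computation: the subject numbering 1 to 7 and the column labels are assumptions made for illustration, and the paired t test on the two high-speed measures is only one plausible way to check the reported improvement, since the text does not name the test that was used.

```python
import math

# Scores corrected for chance, transcribed from Table 2.
# Column order: normal speed, initial high speed, Days 1-5 of practice
# at 425 wpm, repeated high speed. Subject numbers are assumed labels.
LABELS = ["normal", "high_initial", "day1", "day2", "day3", "day4", "day5",
          "high_repeated"]
SCORES = {
    1: [16.25, 0.00, 6.50, 2.75, 7.50, 8.75, 9.37, 9.37],
    2: [14.00, 5.21, 9.50, 9.75, 8.75, 7.50, 9.58, 12.29],
    3: [25.00, 8.33, 15.00, 8.75, 16.25, 10.00, 17.70, 11.45],
    4: [22.50, 8.95, 16.25, 11.50, 14.00, 13.75, 16.09, 15.62],
    5: [19.25, 1.46, 0.75, 3.75, 6.75, 14.00, 9.58, 6.66],
    6: [14.25, 0.00, 1.25, 6.50, 1.25, 3.75, 11.45, 4.16],
    7: [10.00, 2.08, 3.00, 0.50, 4.00, 6.50, 9.16, 3.96],
}


def percent_of_normal(row):
    """Express each score as a percentage of the subject's own normal-speed score."""
    return [100.0 * score / row[0] for score in row]


def mean_percentages(scores):
    """Average the per-subject percentages for every column."""
    pct_rows = [percent_of_normal(row) for row in scores.values()]
    n = len(pct_rows)
    return {label: sum(row[i] for row in pct_rows) / n
            for i, label in enumerate(LABELS)}


def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom for after minus before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1


means = mean_percentages(SCORES)
pct = {s: percent_of_normal(row) for s, row in SCORES.items()}
t_high, df_high = paired_t([p[1] for p in pct.values()],
                           [p[7] for p in pct.values()])

for label in LABELS:
    print(f"{label:>14}: {means[label]:5.1f}%")
print(f"paired t for the high-speed passage = {t_high:.2f}, df = {df_high}")
```

Because each percentage is taken against the subject's own normal-speed score, the column means are means of individual percentages rather than the percentage of the mean score.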

With the exception of the 475 wpm 
passage and test, the figures shown in 
Table 1 and Figure 1 are based upon dif- 
ferent tests, and test passages are thus in- 
dependent estimates of performance. Pas- 
sages were taken from the same book of 
early English history, however, and tests 
were constructed to be equivalent accord- 
ing to item statistics derived from the same 
population of students. 

As an illustration of the extent to 
which variables such as type of test and 
material and subject variability can affect 
the results, however, one may consider the 
results for the short-answer tests and 
essay tests on the practice materials. 
These were not intended to do more than 
motivate Ss to listen to the practice ma- 
terials, and it was not possible to stand- 
ardize these measures. The results empha- 
sized the extent to which the evaluation of 
comprehension of compressed speech is 
dependent on such variables, showing no 
discernible trends and great subject vari- 
ability. The lack of correlation between 
short-answer and essay results (r = 0) 
indicated further that these measures are 
not very dependable. To the extent that 
research such as this depends upon such 
“pick-up” measures, it is certainly open 
to question. 

Fig. 1. Mean percentage of normal performance 
over 5 days (N = 7). 

With respect to pre- and posttest scores 
on the reading and standard listening 
tests, mean increases of about 7-8% were 
observed in each case. Of course, this 
modest figure was not significant with an 
N of only 7. 

The results of the present experiment 
again confirm previous findings (Orr et al., 
1965) that comprehension of time-com- 
pressed speech can be improved by simple 
practice routines to relatively high levels 
at speeds of about 2½ times normal. By 
the end of the week, all Ss had reached 
the 50% comprehension mark, although 
several started from as low as 5-10% on 
the first day. There is reason to believe 
that some further increases would have 
been obtained had the experiment been 
prolonged. The effects of the training on 
other variables such as reading test scores 
and standard listening test scores, while in 
the right direction, were not great, how- 
ever. 

Another question of interest is the rela- 
tive effectiveness of the method here em- 
ployed to achieve approximately 70% 
comprehension at 425 wpm. During their 
5 days of intensive exposure, these Ss 
spent approximately 35 hours listening to 
compressed speech. Their results can be 
compared with those of three previous 
groups of similar composition, all of which 
received about 12-14 hours of practice 
distributed at about 1-2 hours per day, 
2-3 days per week, over about 4-5 weeks. 

Group A. Practice with speeds gradually 
increasing from 325 to 425 wpm; 

Group B. Graduated practice from 325 
to 425 wpm (with 3-minute breaks every 
10 minutes); 

Group C. High-speed practice (425 wpm 
only). 

The mean result for Group A was 79% 
of normal at 425 wpm; for Group B, 80% 
at 425 wpm; and for Group C, 71% at 425 
wpm. Thus, the investment of 12-14 hours 
of spaced practice produced results as 
good or better than the investment of al- 
most three times as much practice in the 
present “immersion” study. A similar con- 
clusion was reached after looking at the 
mean improvements from pre- to postex- 
perimental scores on the repeated high- 
speed (475 wpm) test passage. 

With respect to the subjective comments 
gathered on the debriefing questionnaire, a 
few comments can be made. All Ss felt that 
practice had improved their ability to com- 
prehend compressed speech and five of the 
seven felt that more practice would lead 
to further improvement. Attention wander- 
ing, particularly on less interesting parts of 
the practice material, was seen as one of 
the chief problems. However, most felt 
that their powers of concentration had been 
improved by the experiment. Finally, all 
Ss felt that their performances would be 
improved with the use of learning aids, 
such as outlines, key words, abstracts, 
simultaneous availability of the text, and 
selected repetition. Further experimenta- 
tion is being conducted. 

The findings of the present study tend 
to reinforce the conclusions that time- 
compressed speech offers substantial pos- 
sibilities as an educational technique. 




Comparatively high levels of comprehen- 
sion are possible after comparatively lim- 
ited amounts of training at substantially 
increased rates of speed. Rather high lev- 
els of comprehension at 2½ times normal 
speed can be obtained in a period as short 
as one working week by means of concen- 
trated practice. Thus, the findings do con- 
firm the feasibility of a concentrated train- 
ing course in compressed speech as a 
prelude to the regular school term, if the use 
of highly compressed speech should become 
a usual educational practice. 

Of course, it must be emphasized that 
the application of time-compressed speech 
in the classroom does not mean the elim- 
ination of the professor. The technique is 
seen as a way to maximize his effectiveness. 
Routine material, survey material, etc., 
can be effectively presented in this man- 
ner, leaving the professor free to concen- 
trate on those aspects of his teaching 
which demand more of his skill. For exam- 
ple, after presenting the basic materials for 
a day’s discussion in half the usual time, 
he would be free to conduct a postpresenta- 
tion critical analysis of the material, to 
organize round-table discussions, and to ap- 
ply any one or more of the many educa- 
tional techniques that the professor cur- 
rently cannot apply in the classroom 
through a lack of time. 


It is recognized that the proposed tech- 
niques are not likely to be suitable for all 
kinds of material, and currently experi- 
mentation is underway to determine the 
types of material for which the proposed 
technique is most suitable. Further, the 
technique is not equally effective with all 
students (as what technique is?), and in- 
quiries are underway to attempt to de- 
termine those student characteristics which 
seem to interact most closely with success 
in comprehending time-compressed speech. 


EFFECT OF WORD ASSOCIATIONS ON READING SPEED, 
RECALL, AND GUESSING BEHAVIOR ON TESTS 


S. JAY SAMUELS* 
University of Minnesota 


A paragraph containing words with high-associative (HA) relation- 
ships should be read faster and with better recall than a similar para- 
graph containing words with low-associative (LA) relationships. 
Mean reading time for elementary school Ss in the HA condition was 
43.82 sec. and 58.81 sec. in the LA condition (p < .05). The mean num- 
ber of questions answered correctly for HA was 9.50 and 5.04 for LA 
(p < .001). When college Ss read the same paragraphs, the mean time 
was 35.26 sec. for HA and 38.86 sec. for LA (p < .01). The mean number 
of questions answered correctly was 9.69 for HA and 6.87 for LA 
(p < .001). When required to guess the correct answer, control Ss chose 
significantly more often alternatives which contained words having 
HA relationships with words in the stem of the question. Results on 
reading speed are discussed in terms of the effect of word associations 
on perceptual factors in word recognition. 


Taylor (1963) has indicated that when 
sentence meaning does not suggest the next 
word in a sentence, the reader must engage 
in more careful visual analysis of the next 
word than when sentence meaning does 
suggest the next word. In support of this 
point of view, Goodman (1965) found that 
children were able to read words in the 
context of a sentence which they were un- 
able to read when presented alone. He also 
states that one of the reasons why words 
are misread, resulting in time-consuming 
regressions, is that sentence meaning may 
miscue the reader. In an investigation of 
word-recognition speed, Tulving and Gold 
(1963) found that as the amount of in- 
formation in a sentence increased, the time 
required to recognize a target word de- 
creased. Similarly, O’Neil (1953) and 
Rouse and Vernis (1963) demonstrated 
that when word associates are tachisto- 
scopically exposed in succession, recogniz- 
ing the first word aids in recognizing the 
second word, and the stronger the asso- 
ciation between the words, the lower the 
recognition threshold. 

Not only do word associations influence 
speed of word recognition, but they in- 
fluence recall as well. Rosenberg (1965) 
demonstrated that more words designated 
as stimulus or response words were recalled 
after hearing a paragraph containing highly 
associated words than after hearing a sim- 
ilar paragraph containing words with 
lower associative relationships. Deese 
(1959) also found that more words can be 
recalled after exposure to a word list con- 
taining associated words than after expo- 
sure to a word list containing words which 
were not associatively related. 

* Appreciation is extended to Barbara Best for 
her valuable help on this study. 

Because of these findings, it was pre- 
dicted that a paragraph containing words 
with high-associative relationships would 
be read faster with better recall than a 
similar paragraph containing words with 
low-associative relationships. It was also 
predicted that when subjects (Ss) are re- 
quired to answer multiple-choice questions 
without having read the paragraph upon 
which the questions are based, they would 
choose alternatives on the basis of the 
strength of the associative relationship be- 
tween words in the stem of the question 
and the response alternatives. 


EXPERIMENT I 


Method 


Subjects. The Ss were fifth and sixth graders 
from Minneapolis elementary schools who were 
randomly assigned to read either a high-association 
(HA) or low-association (LA) paragraph and to 
answer questions based on the paragraph. Thus, 
28 Ss were assigned to HA and 26 Ss to LA para- 
graphs. Forty Ss were assigned to a control condi- 
tion in which they answered questions based on the 
readings but did not actually read a paragraph. 

Materials. Two 150-word paragraphs developed 
by Rosenberg (1965) were used. Rosenberg se- 
lected stimulus and response words of considerable 
strength from the New Minnesota Norms (Pa- 
lermo & Jenkins, 1964), and these words were used 
in the HA paragraph. Words such as MAN, CHEESE, 
CARPET, and MOON were designated as stimulus 
words, and from one to five of their stronger as- 
sociates, designated as response words, were in- 
cluded in the HA paragraph. 

The LA paragraph contained the same stimulus 
words as the HA paragraph, but the words desig- 
nated as response words in the HA paragraph were 
deleted. In their place, for the LA paragraph, words 
were used which are not commonly associated 
with the stimulus words. These words were of the 
same Thorndike-Lorge frequency, were semanti- 
cally appropriate in the context, and were gram- 
matically the same as the words they replaced. 

To test recall, 12 multiple-choice questions were 
written. The same questions were given to all Ss 
regardless of the paragraph they read. Each ques- 
tion had four alternatives. One of the four alterna- 
tives was a response word from the HA paragraph 
while a second alternative was a response word 
from the LA paragraph. The two other alternatives 
were used as distractors. To answer the question 
correctly, S had to select the alternative contain- 
ing the response word from the paragraph he read. 

Parts of the HA and LA paragraphs are repro- 
duced below along with one of the questions. 

HA. They were all happy to be together again. 
Outside the moon and stars shone brightly in the 
June sky, and the green grass sparkled in the 
night. 

LA. They were all relieved to be together again. 
Outside the moon and lake appeared clearly in 
the June evening, and the green house sparkled 
in the valley. 

Question. The green ______ sparkled. (a. house, 
b. plants, c. grass, d. emeralds) 

The paragraphs and questions were mimeo- 
graphed on 8½ × 11-inch paper. Page 1 contained 
either the HA or LA paragraph, while Pages 2 
and 3 contained the questions. 

Procedure. The experimenter (E) worked indi- 
vidually with Ss. The S was told that he was going 
to be given a paragraph to read, and that after 
reading the paragraph he would have to answer 
questions about the paragraph. The S was also 
told to read the paragraph quickly but carefully, 
and that E would not be able to help S read any 
words. The S was told to look up to indicate he 
was finished reading. At a signal from E, S began 
to read. A stopwatch was used to measure the time 
required to read the paragraph. To answer ques- 
tions, S was instructed to circle the alternative he 
thought was correct. 

The Ss in the control condition were tested in 
a group. Their materials contained only the 12 
questions. They were told: “Answer these ques- 
tions as you think they should be answered.” 


Results 


The mean time to read the HA para- 
graph was 43.82 seconds (SD = 6.62) 
while the mean time to read the LA para- 
graph was 58.81 seconds (SD = 36.78). 
Although the t test is a robust test with 
regard to moderate departures from as- 
sumptions regarding homogeneity of the 
variances, the variances in this analysis 
were sufficiently different to warrant the 
use of Welch’s (Winer, 1962) approxi- 
mation to the sampling distribution of the 
t statistic. Using the conservative test, the 
difference between the means was signifi- 
cant (t = 2.04, df = 27, p < .05, one- 
tailed). 
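
The unequal-variance comparison can be approximated from the summary statistics alone. The sketch below assumes the familiar Welch-Satterthwaite form of the approximation; the exact variant described in Winer (1962) may differ in how the degrees of freedom are rounded, and the function name `welch_t` is illustrative only.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t and approximate df from group means, SDs, and sizes.

    The variances are not pooled; each group contributes its own
    squared standard error to the denominator.
    """
    se1_sq = sd1 ** 2 / n1
    se2_sq = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df


# Experiment I reading times: LA (n = 26) versus HA (n = 28).
t_read, df_read = welch_t(58.81, 36.78, 26, 43.82, 6.62, 28)
print(f"t = {t_read:.2f}, df = {df_read:.1f}")
```

With these inputs the statistic lands close to the reported t = 2.04, and the approximate degrees of freedom fall between 26 and 27. The large LA variance dominates the denominator, so the degrees of freedom come out near those of the LA group alone rather than near the pooled value of 52.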

The mean number of questions answered 
correctly for the HA group was 9.50 (SD = 
1.43), while the mean number of questions 
answered correctly by the LA group was 
5.04 (SD = 1.91). The difference between 
the means of the two groups was significant 
(t = 9.70, df = 52, p < .001, one-tailed). 
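
For the recall scores, where the two variances are similar, an ordinary pooled-variance t test applies. The following is a minimal sketch from the reported means and standard deviations; `pooled_t` is a hypothetical helper name, and small differences from the published value are to be expected because the inputs are rounded.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t with a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Experiment I recall: HA (n = 28) versus LA (n = 26).
t_recall, df_recall = pooled_t(9.50, 1.43, 28, 5.04, 1.91, 26)
print(f"t = {t_recall:.2f}, df = {df_recall}")
```

The result agrees with the reported t = 9.70 to within rounding of the summary statistics, with the same 52 degrees of freedom.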

When the mean number of HA alterna- 
tives correctly chosen by HA Ss who read 
the paragraph (M = 9.50, SD = 1.48) 
was compared with the mean number of 
HA alternatives guessed by control Ss 
(M = 5.75, SD = 1.35), the difference 
was significant (t = 11.03, df = 66, p < 
.001, one-tailed). 

When the mean number of LA alterna- 
tives correctly chosen by LA Ss who read 
the paragraph (M = 5.04, SD = 1.91) 
was compared with the mean number of 
LA alternatives guessed by control Ss 
(M = .88, SD = .85), the difference was 
significant (t = 10.49, df = 64, p < .001, 
one-tailed). 

To test the prediction that when forced 
to guess, Ss would choose alternatives 
on the basis of the associative strength of 
the alternatives, a frequency distribution of 
choices for high- and low-association 
strength alternatives was made for each 
question. The high-association strength 
alternative was chosen more frequently 
for 11 of the 12 questions (sign test, p < 
.001, one-tailed). 


EXPERIMENT II 


Method 


Subjects. The Ss were 135 juniors enrolled in 
an educational psychology course. Fifty Ss were 
in the HA, 53 Ss in the LA, and 32 Ss in the con- 
trol condition. 

Materials. The same materials were used for 
the three conditions as in Experiment I. 

Procedures. HA and LA Ss were tested at the 
same time. After the Ss were seated, they were 
told not to open their test materials until in- 
structed. The test materials were then randomly 
distributed by alternately placing HA then LA 
materials face down on the desks of Ss. The Ss 
were then told that a signal would be given upon 
which they were to begin to read the paragraph 
quickly and carefully. Before they began to read, 
they were told that as soon as they finished read- 
ing the paragraph, they were to find the number 
of seconds it had taken them to read the paragraph 
by looking at the chalk board located in the front 
of the room. Upon the board consecutive numbers 
were written. The Ss found the time it had taken 
them to read the paragraph by locating the last 
number crossed off the board, and they wrote this 
number at the top of Page 2 of the test materials. 
The E held a stopwatch, and as each second 
passed he crossed off the number indicating the 
time. The Ss were told that once they finished 
reading the paragraph they could not go back to 
the paragraph. Answers to the questions were to be 
indicated by circling one of the alternatives. 

The Ss in the control condition were tested as a 
group at a different time. Instructions were the 
same as those given to control Ss in Experiment I. 


Results 


The mean time to read the HA para- 
graph was 35.26 seconds (SD = 7.66) 
while the mean time to read the LA para- 
graph was 38.86 seconds (SD = 6.45). 
The difference between the means of the 
two groups was significant (t = 2.57, df = 
101, p < .01, one-tailed). 

The mean number of questions answered 
correctly by the HA group was 9.68 (SD = 
1.81), while the mean number of questions 
answered correctly by the LA group was 
6.87 (SD = 2.30). This difference between 
the means of the two groups was signifi- 
cant (t = 6.89, df = 101, p < .001, one- 
tailed). 

When the mean number of HA alterna- 
tives correctly chosen by the HA Ss who 
read the paragraph (M = 9.68, SD = 1.81) 
was compared with the mean number of 
HA alternatives guessed by control Ss 
(M = 6.08, SD = 4.29), the difference 
was significant (t = 8.11, df = 80, p < 
.001, one-tailed). 


When the mean number of LA alterna- 
tives correctly chosen by LA Ss who read 
the paragraph (M = 6.87, SD = 2.30) 
was compared with the mean number of 
LA alternatives guessed by control Ss 
(M = 1.59, SD = 1.16), the difference was 
significant (t = 14.64, df = 83, p < .001, 
one-tailed). 

To test the prediction that when forced 
to guess, Ss would choose alternatives on 
the basis of the associative strength of the 
alternatives, a frequency distribution of 
choices for high- and low-association 
strength alternatives was made for each 
question. The high-association strength 
alternative was chosen more frequently 
for 10 of the 12 questions (sign test, p < 
.003, one-tailed). 


Discussion 


The results supported the hypotheses 
that word-association strength influences 
reading speed, recall, and guessing behavior 
on multiple-choice tests. The paragraph 
with stronger associative relationships be- 
tween words was read faster and with 
better recall than the paragraph with 
weaker associative relationships. When 
forced to guess on multiple-choice ques- 
tions, Ss tended to choose alternatives 
that had strong associative relations with 
words in the stem of the question. 

Although these results support the hy- 
potheses, the mechanisms through which 
word associations affect reading are not 
disclosed. Eye movement photography 
taken while reading indicates that eye 
movements consist of fixations, interfixa- 
tions, and regressions. Fixations account 
for about 92-94% of the reading time. 
Gilbert (1959) separated duration of fixa- 
tion into seeing time, central processing 
time, and stabilizing time. The length of 
time given to a fixation depends on the 
difficulty of the passage (Tinker, 1958) 
and familiarity with the textual material 
(Morton, 1964). A regression is a return to 
a previously fixated word and occurs when 
there is need for verification (Bayle, 1942). 
Whatever effect word associations may 
have on the mechanisms just described, an 
assumption is made that word associations 
have little effect on stabilizing and inter- 
fixation time, but may influence reading 
speed by affecting seeing time, central proc- 
essing time, and number of regressions. 


EFFECTS OF REVIEW AND TESTLIKE EVENTS 
WITHIN THE LEARNING OF PROSE MATERIALS 


ROGER H. BRUNING* 
University of Nebraska 


Effects of format and criterion-test (CT) relevance of within-learning 
review items were tested in a factorial design. A 1500-word passage was 
divided into 6 sections, each followed by from 3 to 7 statement- or test- 
type review items. Test-type and CT-relevant groups showed signifi- 
cantly fewer errors (p < .01) than statement-type and CT-irrelevant 
groups, respectively. CT-irrelevant testing reduced errors (p < .01) 
over both CT-irrelevant statements and a reading control group, indi- 
cating that testing independently facilitates learning from such mate- 
rials. Additionally, effects of testing and specific review were seen to 
be additive with CT-relevant testing significantly reducing errors (p < 
.01) over both CT-relevant statement and CT-irrelevant tested re- 
view. 


Recent research (Hershberger & Terry, 
1965a, 1965b; Rothkopf, 1966) has indi- 
cated that testlike events within prose 
learning materials may have both specific 
and general facilitative effects on learning. 
Specifically, such events may function as 
additional practice trials for factual and 
conceptual materials and they may also 
serve more generally as one type of en- 
vironmental control for inspection behav- 
iors, affecting such things as set, rate, and 
persistence of the reading responses. Under 
the control of such testing the learner tends 
to engage in more careful inspection of 
the prose document and to search for mean- 
ingful facts and concepts consistent with 
those encountered in the testlike events. 

An earlier study by Rothkopf and Coke 
(1963) presents evidence that such test- 
type review may have a relatively greater 
positive effect on learning than does review 
of identical materials in the form of de- 
clarative sentences. However, results were 
positive only for a mixed-mode design in 
which subjects (Ss) learned materials 
under both review conditions, and when 
independent treatment groups were em- 
ployed, sentence review was at least as 
good as test-type review. The present study 
represents an additional test of the relative 
efficacy of sentence-type and test-type re- 
view when independent treatment groups 
are employed. 

* The author would like to express his apprecia- 
tion to Kenneth D. Orton, University of Nebraska, 
for his most helpful suggestions throughout the 
study and for his critical reading of the manuscript. 

Although Rothkopf’s (1966) study pre- 
sents strong indication that such testlike 
events function both independently and in 
conjunction with review content, no meas- 
ures of the relative and combinational ef- 
fects of the two variables are available 
since different criterion measures were 
used for each. To provide additional evi- 
dence on this question, the present study 
employed a single test of retention to per- 
mit a more direct comparison of the effects 
of testlike events per se with those of test- 
like events which serve the additional 
function of reviewing materials relevant 
to the criterion test. A 2 x 2 factorial 
analysis of variance design was thus em- 
ployed to test the effects of the manipulated 
parameters, review relevance and review 
item format. 


METHOD 


Subjects. The Ss were 69 students from an in- 
troductory educational psychology course at the 
University of Nebraska. Participation in the ex- 
perimental sessions was a course requirement. 

Materials. A highly factual, descriptive passage 
of approximately 1500 words dealing with charac- 
teristics of a fictitious African tribe was formu- 
lated. The passage was divided into six parts, the 
first approximately 150 words long and the others 
approximately 275 words each. Following each 
section was a review page containing from three 
to seven items dealing with facts presented in that 
section. 

Procedure. Variations in the material and for- 
mat of the review pages constituted the major 
experimental treatments of the study. From six 
to fourteen statements of major factual points 
were drawn from each section. The statements 
from each section were then paired by the experi- 
menter on criteria of difficulty and importance. 
Members of pairs were then assigned at random 
to either a general criterion test-relevant (CR) or 
to a criterion test-irrelevant (CI) condition. For 
each relevance condition two formats for review 
items were developed, statement-type items and 
test-type items. Statement-type items consisted of 
declarative sentences while test-type items on the 
same relevance dimension were identical except 
for the deletion of the key response term, which 
S was required to supply. Knowledge of results for 
test-type items was provided by lifting a tab 
covering the answer. As combinations of the major 
variables, criterion relevance and format, the four 
major treatment conditions were: (a) criterion- 
irrelevant statements (CIS), (b) criterion-irrele- 
vant test items (CIT), (c) criterion-relevant state- 
ments (CRS), and (d) criterion-relevant test items 
(CRT). A fifth reading control group (N = 13) 
had no review material of any kind within the 
prose materials. This group only read through the 
materials sequentially without review or testing 
within the learning situation. For all Ss, materials 
were learned under only a single treatment condi- 
tion. 
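
The assignment of paired statements to the two relevance conditions can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the statement labels are hypothetical placeholders, and a coin flip per pair is only one way of assigning members "at random."

```python
import random


def assign_pairs(pairs, rng):
    """Send one member of each matched pair to CR and the other to CI.

    `pairs` is a list of (statement_a, statement_b) tuples matched on
    difficulty and importance; `rng` is a random.Random instance.
    """
    criterion_relevant, criterion_irrelevant = [], []
    for first, second in pairs:
        # A coin flip decides which member of the pair goes to CR.
        if rng.random() < 0.5:
            first, second = second, first
        criterion_relevant.append(first)
        criterion_irrelevant.append(second)
    return criterion_relevant, criterion_irrelevant


# Hypothetical matched pairs for one section of the passage.
section_pairs = [("statement 1a", "statement 1b"),
                 ("statement 2a", "statement 2b"),
                 ("statement 3a", "statement 3b")]
cr_items, ci_items = assign_pairs(section_pairs, random.Random(0))
print("CR:", cr_items)
print("CI:", ci_items)
```

Pairing before randomizing keeps the CR and CI item sets comparable in difficulty and importance, so that a difference between relevance conditions is less likely to reflect the items themselves.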

The general criterion test (CT) was composed 
of the 26 review items associated with the final 
five sections of the learning material and pre- 
viously assigned to the CR conditions. Thus these 
items were identical in content to the statements 
of the CRS condition and completely identical in 
both content and format to the test items of the 
CRT condition. On the CT no knowledge of re- 
sults was provided, however, and items were ran- 
domized to control for possible serial effects as- 
sociated with the original learning materials. 

Learning and review materials were put into 
booklet form and coded by treatment condition. 
Booklets were then randomized and handed out in 
a classroom setting. General instructions required 
that Ss read through the materials at their own 
speed in preparation for a posttest and, in addition, 
did not allow Ss to skip ahead or to look back 
once they had finished with a page. Instructions 
specific to learning conditions were contained in 
the booklets. For CIS and CRS (statement for- 
mat) conditions review pages were prefaced by the 
following instruction: 

As a review, some of the major points made on 
the preceding page are the following. 

For CIT and CRT (test format) conditions, re- 
view pages were prefaced by the following: 

Try to answer these questions. After writing the 
answer in the space, check it by lifting the tab 
across from the answer. Please do not look at 
the answer until you have answered on your 
own. These questions are only for review and 
will not be graded. Answers may require more 
than one word in some cases. 


When an S had finished with the learning mate- 
rials, his booklet was taken up, time taken to com- 
plete the materials was noted, and a copy of the 
CT given him. The CT was prefaced by a short 
questionnaire requiring some personal data and a 
rating of the difficulty of the reading materials 
and, as a result, a period of approximately 5 minutes 
elapsed between the completion of learning and 
beginning on the CT itself. The major purpose of 
the questionnaire was to interpolate a short period 
of time between learning and tested recall, thus 
minimizing the possibility that answers could be 
drawn from the immediate memory store. 


RESULTS 


Mean error scores of the five treatment 
conditions are presented in Table 1. As can 
be seen from the range of scores, the per- 
formance on the criterion measure varied 
considerably as a result of the different 
treatment conditions. In the analysis of 
variance for the four major treatment 
conditions, main effects for both criterion 
relevance and format conditions were sig- 
nificant beyond the .01 level (F = 31.18, 
df = 1/52, and F = 17.89, df = 1/52, re- 
spectively), although there were virtually 
no interaction effects. Contrasts of the indi- 
vidual means (Scheffé, 1959) on the test 
format level revealed significantly fewer 
errors (p < .01) for the CRT than for the 
CIT condition and with the statement for- 
mat, the CRS group showed significantly 
fewer errors (p < .01) than the CIS group. 
Similarly, in comparisons by level across 
the format dimension, CRT scores were 
significantly lower (p < .01) than CRS 
scores and CIT scores significantly lower 
than CIS scores (p < .01). Thus these re- 
sults substantiate those of Rothkopf (1966) 
in showing that such testing within learn- 
ing, even when content tested is unrelated 
to that appearing on the criterion measure, 
independently and significantly improves 
performance on a criterion test. That the 
effects of review and testing may be ad- 
ditive is indicated in the difference between 
the CRT condition and both the CRS and 
CIT conditions. While both criterion-rele- 
vant review by statements and crite- 
rion-irrelevant testing brought about sub- 
stantial gains in learning over the CIS 
condition, their combined effects were sig- 
nificantly greater than either alone. 


TABLE 1 
Mean Error Scores on the CT and Mean Time (in Minutes) Spent on the Learning Materials 

                Errors             Time
Treatment    Mean     SD       Mean     SD
CIS         14.93    6.18     14.50    2.68
CRS          9.07    2.86     14.42    2.67
CIT         10.50    3.01     20.79    3.91
CRT          4.57    2.62     21.93    3.61
RC          14.00    3.85     11.84    1.46

A potential source of bias in the analysis 
was an inequality of cell variances. How- 
ever, results from Scheffé (1959) indicate 
that inequalities of the order existing in 
the present data with equal cell sizes 
cause only a slight increase in the proba- 
bility of a Type I error and would not 
appear to nullify the present results since 
all comparisons were significant beyond 
the .01 level. As an additional check, how- 
ever, logarithmic transformations of scores 
were employed in a separate analysis. Re- 
sults from these scores completely sub- 
stantiated those using raw score data with 
effects of format and criterion relevance 
both significant beyond the .01 level (F = 
18.6, 29.2; df = 1/52), together with a 
similar nonsignificant interaction effect. 

The t-test comparisons of the reading 
control group (RC) with the four experi- 
mental conditions at the .01 level revealed 
that all treatment conditions except the 
CIS condition had significantly lower error 
rates on the CT than did the RC group. 

As a comparison of the relative potency 
of the two major treatment variables, cri- 
terion relevance and format, an estimation 
of the correlation between the variables 
and criterion measure, omega squared 
(ω²) (Hays, 1963), was obtained. For the 
relevance dimension, ω² was .30 com- 
pared with .17 for the format dimension. 
Thus, in terms of explained variance, the 
relevance of review can be seen as con- 
tributing a somewhat larger proportion of 
explained criterion variation than did the 
format manipulations. However, as is evi- 
dent from the results of Table 1, mean dif- 
ferences between the conditions of testing 
without review (CIT) and the condition of 
review without testing (CRS) were slight 
and, of course, nonsignificant. 
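
The F ratios and ω² values can be recovered approximately from the cell means and standard deviations in Table 1. The sketch below assumes 14 Ss in each of the four experimental cells, which is inferred from 69 Ss less 13 controls and from the reported 52 error degrees of freedom; the function name and dictionary keys are illustrative. It uses the fixed-effects estimate ω² = (SS_effect − df_effect × MS_error) / (SS_total + MS_error), with one degree of freedom per effect.

```python
# Error-score summaries from Table 1: (relevance, format) -> (mean, SD).
CELLS = {
    ("CI", "statement"): (14.93, 6.18),  # CIS
    ("CR", "statement"): (9.07, 2.86),   # CRS
    ("CI", "test"): (10.50, 3.01),       # CIT
    ("CR", "test"): (4.57, 2.62),        # CRT
}
N_PER_CELL = 14  # assumption: 56 experimental Ss spread evenly over 4 cells


def anova_2x2_from_summary(cells, n):
    """Balanced 2 x 2 ANOVA and omega squared from cell means and SDs."""
    means = {key: mean for key, (mean, _) in cells.items()}
    grand = sum(means.values()) / len(means)
    # With equal cell sizes, MS within is the average of the cell variances.
    ms_within = sum(sd ** 2 for _, sd in cells.values()) / len(cells)
    df_within = len(cells) * (n - 1)

    def main_effect_ss(position):
        levels = {}
        for key, mean in means.items():
            levels.setdefault(key[position], []).append(mean)
        # Each level of a factor pools two cells, hence 2 * n observations.
        return sum(2 * n * (sum(v) / len(v) - grand) ** 2
                   for v in levels.values())

    ss_relevance = main_effect_ss(0)
    ss_format = main_effect_ss(1)
    ss_cells = sum(n * (mean - grand) ** 2 for mean in means.values())
    ss_interaction = ss_cells - ss_relevance - ss_format
    ss_total = ss_cells + ms_within * df_within

    def omega_squared(ss_effect):
        # One df per effect in a 2 x 2 design.
        return (ss_effect - ms_within) / (ss_total + ms_within)

    return {
        "df_within": df_within,
        "F_relevance": ss_relevance / ms_within,
        "F_format": ss_format / ms_within,
        "F_interaction": ss_interaction / ms_within,
        "omega_sq_relevance": omega_squared(ss_relevance),
        "omega_sq_format": omega_squared(ss_format),
    }


result = anova_2x2_from_summary(CELLS, N_PER_CELL)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sketch gives F ratios close to the reported 31.18 and 17.89, an interaction F near zero, and ω² values that round to .30 and .17.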

An analysis of time scores for the ex- 
perimental conditions revealed a signifi- 
cantly greater (F = 60.3, df = 1/52, 
p < .001) amount of time spent on the 
learning and review materials for test for- 
mat conditions and a predicted nonsig- 
nificant effect on the relevance dimension. 
The increased time spent on the testing 
format conditions was due in part, of 
course, to the time spent in formulating 
and writing out responses, but the signifi- 
cant reduction of errors associated with the 
testing format points to an increased 
amount of time spent in the inspection of 
the prose document (Rothkopf, 1966). 


Discussion 


The strong facilitating effect of adjunct 
testing upon learning found in the present 
study gives additional support to Roth-
kopf’s (1965) hypothesis that such in-
frequent testing within the learning of
prose materials may be an important en- 
vironmental control of such learning be- 
haviors. Also, the present results show 
that such testing has facilitative effects 
both in conjunction with and independent 
of the review of the actual content to be
learned. In highly factual material of the 
type employed in the present study, inter- 
relationships among content areas were 
minimal and transfer effects from material 
tested within learning on the CI dimension 
and the material appearing on the posttest 
would seem correspondingly very small. 
That such specific transfer effects from the
“irrelevant” review to the criterion test
were indeed inconsequential is indicated 
in the performance of the CIS group, 
which actually showed a slightly higher 
error rate than did the reading control 
group, which had no review of any kind. 
Thus the positive effects found when re- 
view was irrelevant and test format was 
employed can be attributed only to im- 
provement in learning behaviors associated 
with the prose materials caused by the 
testing itself; increased per-page inspection 
time, more active reading behavior, self- 
prompting and the like, together with any 
generalized reinforcing effects which may
have occurred due to self-testing. When
review was directly relevant to the cri- 
terion test, the significant reduction of 
errors through testing would seem due in 
part to these same general factors. In ad- 
dition, however, the facilitating effects of 
active formulation, rehearsal, and rein- 
forcement of specific responses would ap- 
pear to be operant. 

As has been noted previously (Roth- 
kopf, 1965), the principles of active for- 
mulation and testing of responses within 
learning have been commonly employed 
within the context of programmed instruc- 
tion. However, a very high test/content 
ratio exists in programmed instruction
together with rigid behavioral controls in 
contrast to the very low test/content ratio 
and the more informal controls exerted in 
the present prose materials. Nevertheless, in 
spite of the apparently minimal controls 
which are applied, such infrequent testing 
is seen to bring about significant increases 
in the learning from prose documents, at 
least within the experimental context. If
further research reveals a wider generality 
in its functioning to various grammatical 
and contextual situations, such testing may 


present a possible and perhaps highly prac- 
tical alternative to the lack of effective be- 
havioral control in ordinary reading and to 
the inherent rigidity of programmed in-
struction.


TEXTUAL CONSTRAINT AS FUNCTION
OF REPEATED INSPECTION


College and high school students (N = 155) were exposed to 1 of 2
experimental passages either 0, 1, 2, or 4 times. After the completion 
of the required number of self-paced inspections, learning (textual 
constraint) was measured by the completion method (Cloze proce-
dure). Repeated exposure was accompanied by declines in both
practice text and completion test inspection times. Proportion of 
correct fill-in responses was found to be an increasing, negatively 
accelerated function of the number of practice exposures. The data 
were consistent with the view that repeated, massed exposures re- 
sulted in progressive modification or extinction of inspection (mathe- 
magenic) behavior. 2 alternative hypotheses were rejected. The com- 
pletion procedure appears to be a simple, quantitative method for 


estimating what is learned from written discourse. 


Textual constraint has been estimated 
by guessing procedures such as those pro-
posed by Shannon (1951) and others (e.g.,
Burton & Licklider, 1955; Miller & Fried- 
man, 1957). With connected prose the 
guessing units have usually been single 
words which have been deleted from the 
text and which subjects (Ss) are asked to 
replace. This method, which has sometimes 
been called the completion procedure, has 
been used in psychological investigations 
at least since 1897 (Ebbinghaus, 1897; 
for a historical discussion see MacGinitie, 
1960). More recently, it has become in- 
creasingly common to follow the practice 
of Taylor (1954) and refer to this method 
as the Cloze procedure. 

Studies of constraints in language have 
been motivated in the past mainly by 
questions about information processing and 
similar short-term effects of communica- 
tions. Some investigators, for example, 
have been trying to specify how much 
noise a language-encoded message can tol- 
erate before it becomes difficult to under- 
stand (e.g., Miller, Heise, & Lichten, 1951). 
Others have been trying to predict the 
readability of books (Taylor, 1953) or to 
understand the information-handling ca- 
pacity of readers (Pierce & Karlin, 1957). 

¹ The author is greatly indebted to Esther U.
Coke for help in the computer analysis of the data.
Carolyn Shefsky provided valuable assistance in
all phases of this experiment.

Implicit in the notion of textual con-
straint is the assumption that these con-
straints reflect learned associations among 
various units of the language chain. These 
learned associations can be thought to be- 
long to two general classes: (a) those re- 
lated to syntax, which are relatively similar 
from passage to passage within a given lan- 
guage; and (b) those related to the guesser’s 
knowledge about the subject matter with 
which the experimental passage is con- 
cerned. The second class of constraining as- 
sociations can be expected to vary widely 
from topic to topic, passage to passage, and 
from reader to reader. 

The hypothesized character of the sec- 
ond source of textual constraint suggests 
that the completion method may provide a 
suitable measure of learning from con- 
nected discourse. The usefulness of the 
completion method in measuring the acqui- 
sition of complex verbal skills is of con- 
siderable interest because of the need for 
simpler quantitative performance measures 
in research on instructional methods and 
on complex learning. 

Systematic studies of the simple effect 
of practice exposures on textual constraint, 
as measured by the completion (Cloze) 
method, do not appear to be available in
the experimental literature. Carroll and as-
sociates (Carroll, Carton, & Wild, 1959)
did demonstrate that textual constraint in
Russian text increased after training in
Russian, but amount of practice was not
systematically varied in their study. King
and Cofer (1960) aurally presented two
short stories six times and found increases 
in Cloze performance that leveled off at 
about 60% correct responses after six pres- 
entations. However, the effects of exposure 
to the experimental passage and repeated 
completion testing were completely con- 
founded in their experiment. As a conse- 
quence it was not possible to determine the 
simple effect of text exposure. 

The purpose of the present experiment 
was to determine changes in textual con- 
straint that result from practice exposures 
when the confounding effects of inter- 
mittent testing have been eliminated. The 
completion method was again used to pro- 
vide a measure of constraint. The study 
was carried out under conditions in which 
exposure time on any one presentation of 
the experimental text was under the con- 
trol of S. This was done because people 
usually control their own exposure time 
during the study of written text. There- 
fore frequency of exposure with self-pacing 
appeared a more interesting and useful in- 
dependent (practice) variable than expo- 
sure frequency with time per exposure held 
constant. The use of self-paced practice 
was also partly motivated by interest in 
inspection time as a dependent measure. 


METHOD


Materials 


Two passages, one on leather making (approxi-
mately 1500 words, excerpted from Parker, 1911²)
and one on the history of Australia (approximately
750 words, excerpted from MacInnes et al., 1964,
pp. 23-24³) were used. Practice and completion test
material were mimeographed. Since “substantive” 
or “content” learning was the focus of interest, 
only nonfunction words were deleted in the com- 
pletion test. With this restriction, approximately 
one word in ten was removed by an otherwise 
unbiased algorithm. A line of uniform length was 
put in place of the deleted word. 
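
The deletion rule described above might be sketched as follows. This is only an illustration under stated assumptions: the list of function words, the handling of punctuation, and the choice to delete the first eligible word at or after every tenth position are all hypothetical, since the text does not specify the algorithm beyond "approximately one word in ten" with function words excluded. The sample sentence is invented.

```python
import re

# Illustrative stop list; the study's definition of "function word" is not given.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "and", "or", "to", "in", "on", "by", "is",
    "are", "was", "were", "it", "that", "with", "for", "as", "at", "from",
}
BLANK = "__________"  # a line of uniform length replaces each deleted word

# Optional leading punctuation, an alphabetic word, optional trailing punctuation.
TOKEN = re.compile(r"^(\W*)([A-Za-z]+(?:['-][A-Za-z]+)*)(\W*)$")


def make_cloze(text: str, every: int = 10):
    """Return (test_text, answers), deleting about one word in `every`.

    A deletion becomes due at every `every`-th token; the first token at or
    after that point that is not a function word is the one removed.
    """
    answers, out = [], []
    due = False
    for position, token in enumerate(text.split(), start=1):
        if position % every == 0:
            due = True
        match = TOKEN.match(token)
        if due and match and match.group(2).lower() not in FUNCTION_WORDS:
            lead, word, trail = match.groups()
            answers.append(word)
            out.append(f"{lead}{BLANK}{trail}")
            due = False
        else:
            out.append(token)
    return " ".join(out), answers


sample = ("Hides are soaked in water and then treated with lime to loosen "
          "the hair before the tanning of the leather begins")
test_text, answers = make_cloze(sample)
print(test_text)
print(answers)
```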


Design and Procedure 


Each S was exposed to only one of the two ex-
perimental passages, either 0, 1, 2, or 4 times. For
the leather passage, numbers of Ss for the four ex-
posure frequencies were respectively 16, 17, 18, and
17. For the text on Australia these numbers were
22, 24, 21, and 20. Self-pacing and group procedures
were used throughout. Whenever S completed
an inspection of the experimental passage, he sig-
naled the experimenter, who then collected the
text and immediately provided the document
which was required for the next phase of the ex-
periment.

² This excerpt was used with the kind permission
of the Encyclopaedia Britannica, 425 North Michi-
gan Avenue, Chicago, Illinois.
was used with the kind permis-

Fig. 1. Proportion of correct responses on the
completion test as function of number of expo-
sures to the experimental passage.

Approximately 10 minutes after the completion 
of the predetermined number of inspections of the 
passage, Ss were given the completion test. During
the 10-minute delay, Ss completed a personal 
questionnaire. 

The Ss were instructed to record the time they 
started and finished each page of text. They also 
recorded these times for the completion test. Time 
was recorded from a digital clock which was pro-
jected on a screen.


Subjects 


All Ss who read the leather passage were paid 
volunteer undergraduate students from Fairleigh 
Dickinson University in Madison, New Jersey. So
were half of the Ss reading the text on Australia.
The remaining Australia Ss were paid volunteers
from Grades 11 and 12 of Cranford High School,
Cranford, New Jersey.


RESULTS AND DISCUSSION


Correct Completion Responses 

Proportion of correct responses over all
items on the completion test is shown for
the various treatments in Figure 1. A re-
sponse was scored correct if it was identi-
cal to the word which had been deleted
from the text. Minor variations in spelling
were accepted as correct. Correct responses
increased as a function of repeated expo-
sure to the passage but the learning curve
was negatively accelerated and leveled off
after two inspections. The plots in Figure
1 are quite similar to the acquisition curves
which have been obtained from averaged
group data for many other learning tasks.

Fig. 2. Average inspection time per page of text
for the several exposures. (Only data from the 4-
exposure groups are shown. The lines were fitted
by the method of least squares, E = exposure.)

Fig. 3. Proportion of correct responses on the
completion test as function of the cumulative time
spent in inspecting the passage. (Time is shown
in multiples of inspection time on Exposure 1; see
text for explanation.)
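
The scoring rule stated above (exact replacement counts as correct, and minor spelling variations are accepted) might be sketched as follows. The text does not say how "minor" was judged; the string-similarity threshold used here is an assumption introduced purely for illustration, as are the function names and the sample responses.

```python
from difflib import SequenceMatcher


def is_correct(response: str, target: str, min_similarity: float = 0.8) -> bool:
    """Score one fill-in response against the deleted word.

    Exact matches (ignoring case and surrounding spaces) are correct. A near
    match is accepted when the similarity ratio reaches `min_similarity`,
    which stands in for the unspecified tolerance for spelling variants.
    """
    given, wanted = response.strip().lower(), target.strip().lower()
    if given == wanted:
        return True
    return SequenceMatcher(None, given, wanted).ratio() >= min_similarity


def proportion_correct(responses, targets) -> float:
    """Proportion of items scored correct over all items on the test."""
    scored = [is_correct(r, t) for r, t in zip(responses, targets)]
    return sum(scored) / len(scored)


# Invented responses: one misspelling, one wrong word, one blank left empty.
targets = ["leather", "tanning", "lime"]
responses = ["lether", "hide", ""]
print(proportion_correct(responses, targets))
```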


ERNST Z. ROTHKOPF


The rise in textual constraint with re- 
peated exposure was tested by analysis of 
variance. The results were significant for 
both the Leather (F = 26.53; df = 3/64; 
p < .001) and the Australia passages (F = 
39.07; df = 3/83; p < .001). 


Inspection Time 


One possible account for the negatively 
accelerated form of the acquisition curves 
in Figure 1 is that it is related to the de- 
creases in inspection time which occurred 
at each successive exposure. Decline of in-
spection time with repetition or prolonged
reading has been observed previously
(Premack & Collier, 1966; Rothkopf &
Bisbicos, 1967). Average inspection time
per page is shown in Figure 2 for Ss who 
were exposed to the experimental passage 
four times. The drop in inspection time was 
approximately linear and, as tested by 
analysis of variance, with repeated obser- 
vations on the same Ss, was significant for 
both the Leather (F = 8.73; df = 3/45; 
p < .01) and the Australia (F = 3.32; df
= 3/36; p < .05) passage. Only those Ss 
for whom inspection times were available 
for all pages over all four readings were 
used for these analyses. 

However, the gain from successive in-
spections declines at a greater rate than in-
spection time. This is illustrated in Figure
3, where proportion of correct responses on
the completion test was plotted as a func-
tion of cumulative inspection time through
four exposures. Because of differences
among treatment groups in average inspec-
tion time on the first exposure to the pas-
sage, cumulative inspection time was plot-
ted as a multiple of the amount of time
spent on the first reading. It is quite clear
from inspection of Figure 3 that less and
less learned constraint resulted from suc-
cessive inspection time units as Ss were re-
peatedly exposed to the experimental pas-
sages.
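
The normalization described above, in which cumulative inspection time is expressed as a multiple of time spent on the first reading, can be illustrated with a short calculation. All inspection times and proportions correct below are hypothetical and serve only to show the arithmetic; they are not the values plotted in Figure 3.

```python
from itertools import accumulate


def cumulative_multiples(times):
    """Cumulative inspection time after each exposure, in units of Exposure 1."""
    first = times[0]
    return [total / first for total in accumulate(times)]


# Hypothetical mean inspection times for exposures 1 through 4.
times = [200.0, 180.0, 160.0, 140.0]
multiples = cumulative_multiples(times)

# Groups were tested after 0, 1, 2, or 4 exposures.
time_at = {0: 0.0, 1: multiples[0], 2: multiples[1], 4: multiples[3]}
# Hypothetical proportions correct for the four groups.
correct_at = {0: 0.30, 1: 0.46, 2: 0.54, 4: 0.56}

# Gain in proportion correct per unit of normalized inspection time.
tested = sorted(time_at)
gain_per_unit = [
    (correct_at[b] - correct_at[a]) / (time_at[b] - time_at[a])
    for a, b in zip(tested, tested[1:])
]
print(multiples)
print(gain_per_unit)
```

A falling sequence of gains per time unit is the pattern the paragraph describes: each further unit of inspection time yields less learned constraint.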

The results shown in Figure 3 indicated 
that the amount of learning produced by 
each unit of inspection time decreases 
throughout the several inspections. The ob-
served decline in inspection time over ex-
posures therefore cannot provide a suffi-
cient account for the leveling off of the
acquisition curves in Figure 1. Another ex-
planation may be that the amount learned 
at each successive inspection is propor- 
tional to the amount which has yet to be 
learned. Conceptual models which employ 
the proportionality notion have been used 
to account for the negatively accelerated 
acquisition curves of several learning tasks 
(e.g., Estes, 1950; Hull, 1943, pp. 102-123). 
One of the consequences of such a model is 
that learning curves for items of various 
levels of difficulty should differ markedly 
from each other in slope. This prediction 
was tested, for the Leather passage,⁴ by
subdividing the 123 items of the Leather
completion test according to p₀, the initial
guessing difficulty, that is, percentage cor-
rect response for Ss who did not see the
passage prior to the completion test. Four
different levels were used: p₀ = 0 (N =
45), 0 < p₀ < 0.2 (N = 29), 0.2 < p₀ <
0.4 (N = 18), p₀ > 0.4 (N = 31). Pro-
portion of correct completion responses as
a function of number of inspections prior 
to testing is shown for these four categories 
of items in Figure 4. The four curves were 
nearly parallel. Each of the four classes of 
items gained, on the average, about the 
same number of correct responses per ex-
posure.

The present analysis therefore did not 
support interpretation of slowing of the 
learning rate in terms of approach to a 
common theoretical maximum constraint 
for all items such as “mastery.” The ad- 
ditional assumption that theoretical max- 
ima differ for various deletions in the 
passage, however, may serve to bring the 
proportionality hypothesis in line with the 
data. This is not an unreasonable assump- 
tion since it is well known that there are 
large differences in ambiguity among var- 
ious syntactic and semantic constructions 
(e.g., Aborn, Rubinstein, & Sterling, 1959; 
Coleman, 1965). 

⁴ The Leather passage was used exclusively be-
cause the number and distribution of data points
for the Australia passage was not sufficient.

Fig. 4. Proportion of correct responses on the
completion test as function of number of exposures
to the experimental passage for test items of
various 0-exposure guessing probabilities (p₀).
(The data are from the Leather passage.)

A more likely explanation may be based
on the modification or extinction of atten-
tion-like processes as a function of re-
peated or prolonged inspection (see Roth-
kopf, 1966). Learning from written
materials depends critically on a class of at- 
tention-like responses, called mathema- 
genic behaviors or inspection behaviors 
(Rothkopf, 1963, 1965). The extinction or 
modification of these attention-like proc- 
esses by repeated, massed inspection would 
result in the diminishing learning effects. 
The decline in inspection time, according 
to this interpretation, is an indicator of the 
successive changes in mathemagenic be- 
havior. More direct tests of this hypothe- 
sis are now underway. 


Completion Test Time 

Average test time per page of the com- 
pletion test was plotted as a function of 
number of inspections of the text (Figure 
5). Test time generally decreases for both 
passages. The data are well fitted by the 
line T = 283.57 − 21.04E for Australia,
where T is test time and E is exposure.

Fig. 5. Average time per page of the com-
pletion test as function of the number of exposures
to the experimental passage.
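
As a small worked illustration, the fitted line reported for the Australia passage can be evaluated at the four exposure frequencies used in the experiment. The function name is an assumption, and the unit of test time is whatever unit underlies the reported coefficients, which the text does not state.

```python
def predicted_test_time(exposures: float,
                        intercept: float = 283.57,
                        slope: float = -21.04) -> float:
    """Evaluate T = 283.57 - 21.04 E, the line reported for the Australia passage."""
    return intercept + slope * exposures


for e in (0, 1, 2, 4):
    print(e, round(predicted_test_time(e), 2))
```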


The basis for the decline in test time can- 
not be decided from the present results. It 
may be due to (a) decreased latencies in 
filling in the blanks, or (b) decreased time 
in inspecting the remaining reading ma- 
terials in the test, or both. The first effect, 
that is, a decrease in response latency, has 
been frequently observed to be associated 
with rises in the probability of correctly 
learned responses (e.g., Hull, 1943, pp. 336- 
337). It has also been specifically observed 
in connection with increases in verbal con- 
text (Treisman, 1965). The second possi- 
bility is that test times decrease as func- 
tion of exposure to the practice passage 
because the reading material in the test is 
inspected more hurriedly. This suggests 
that the decreases in inspection time of the 
practice and the test material may have a 
common basis. 

Both of the factors described above may 
contribute to the decline in the time it 
takes to complete the completion test. The 
present data do not rule out the conclusion 
that repeated exposure to an experimental 
passage results not only in more hurried 
inspection of the practice passage but also 


of the test material. The possibility there-
fore exists that the negative acceleration of
the constraint learning curve was at least 
in part due to changes in test inspection 
behaviors rather than modification of 
mathemagenic behavior during practice. 


REFERENCES 


Aborn, M., Rubinstein, H., & Sterling, T. D.
Sources of contextual constraint upon words in
sentences. Journal of Experimental Psychology,
1959, 57, 171-180.

Burton, N. G., & Licklider, J. C. R. Long-range
constraints in the statistical structure of printed
English. American Journal of Psychology, 1955,
68, 650-653.

Carroll, J. B., Carton, A. S., & Wild, C. P. An
investigation of “cloze” items in the measure-
ment of achievement in foreign languages.
Cambridge: Laboratory for Research in Instruc-
tion, Graduate School of Education, Harvard
University, 1959.

Coleman, E. On understanding prose: Some de-
terminers of its complexity. Unpublished report,
New Mexico State University, 1965.

Ebbinghaus, H. Über eine neue Methode zur
Prüfung geistiger Fähigkeiten und ihre An-
wendung bei Schulkindern. Zeitschrift für Psy-
chologie, 1897, 13, 407-424.

Estes, W. K. Towards a statistical theory of learn-
ing. Psychological Review, 1950, 57, 94-107.

Hull, C. L. Principles of behavior. New York:
Appleton-Century-Crofts, 1943.

King, D. On the accuracy of written recall: A
scaling and factor analytic study. Psychological
Record, 1960, 10, 113-122.

King, D., & Cofer, C. N. Exploratory studies of
stories varying in the adjective-verb quotient.
Journal of General Psychology, 1960, 62, 199-
221.

MacGinitie, W. H. Contextual constraint in Eng-
lish prose. Unpublished doctoral dissertation,
Columbia University, 1960.

MacInnes, C., et al. Australia and New Zealand.
Life World Library, New York: 1964.

Miller, G., & Friedman, E. A. The reconstruction
of mutilated English texts. Information and
Control, 1957, 1, 38-55.

Miller, G. A., Heise, G. A., & Lichten, W. The
intelligibility of speech as a function of the con-
text of the test materials. Journal of Experi-
mental Psychology, 1951, 41, 329-335.

Parker, J. G. Leather. In The encyclopaedia
Britannica. (11th ed.) Cambridge, England:
Cambridge University Press, 1911. Vol. XVI. Pp.
330-345.


Pierce, J. R., & Karlin, J. E. Reading rates and
the information rate of a human channel. Bell
System Technical Journal, 1957, 36, 497-516.

Premack, D., & Collier, G. Duration of looking
and number of brief looks as dependent vari-
ables. Psychonomic Science, 1966, 4, 81-82.

Rothkopf, E. Z. Some conjectures about inspec-
tion behavior in learning from written sentences
and the response mode problem in programed
self-instruction. Journal of Programed Instruc-
tion, 1963, 2, 31-46.

Rothkopf, E. Z. Some theoretical and experimen-
tal approaches to problems in written instruc-
tion. In Krumboltz, J. D. (Ed.), Learning and
the educational process. Chicago: Rand Mc-
Nally, 1965. Pp. 193-221.

Rothkopf, E. Z. Learning from written material:
An exploration of the control of inspection be-
havior by test-like events. American Educational
Research Journal, 1966, 3, 241-249.

Rothkopf, E. Z., & Bisbicos, E. E. Selective fa-
cilitative effect of interspersed questions on
learning from written material. Journal of Edu-
cational Psychology, 1967, 58, 56-61.

Shannon, C. E. Prediction and entropy of printed
English. Bell System Technical Journal, 1951,
30, 50-64.

Taylor, W. L. “Cloze procedure”: A new tool for
measuring readability. Journalism Quarterly,
1953, 30, 415-433.

Taylor, W. L. Application of “cloze” and entropy
measures to the study of contextual constraint
in samples of continuous prose. Unpublished
doctoral dissertation, University of Illinois, 1954.

Treisman, A. M. Effect of verbal context on
latency of word selection. Nature, 1965, 206,
218-219.


(Received January 30, 1967)


Journal of Educational Psychology
1968, Vol. 59, No. 1, 26-31


GRADE LEVEL, SCHOOL STRATA, AND 
LEARNING EFFICIENCY¹


WILLIAM D. ROHWER, JR., STEVE LYNCH, JOEL R. LEVIN,
AND NANCY SUZUKI²
University of California, Berkeley 


A 4-way design was used to evaluate the facilitory effects of sentence
verbalization and action depiction on the learning of paired asso-
ciates by 1st-, 3rd-, and 6th-grade children from high- and low-strata
schools. Each of a total of 432 Ss learned a list of 24 pictures of paired
objects presented on movie film by a pairing-test procedure for 2
trials. The 1st of 2 experimental variables, Verbalization, concerned
the type of verbal description given for the pairs (names, phrases,
sentences). The 2nd experimental variable, Depiction, contrasted 2
kinds of pictorial representations of the pairs, one of which was a
visual translation of the name and phrase descriptions (still) and
the other of which was a visual translation of the sentence descrip-
tions (action). The amount learned by older Ss was greater than
that learned by younger Ss regardless of school strata. Sentence
descriptions and action depiction facilitated learning for all Ss, and,
in all conditions, low-strata children learned as efficiently as high-
strata children.


The present experiment was performed 
in order to evaluate hypotheses suggested 
by the juxtaposition of two rather dispar-
ate topics of current research interest: the
improvement of learning efficiency, and
group-related differences in learning effi-
ciency. Concern with the former topic is
well illustrated in the work of Davidson
(1964), Jensen and Rohwer (1965), and
Reese (1965) on conditions for the fa-
cilitation of paired-associate (PA) learn-
ing in children. Two kinds of facilitory
conditions have been isolated: verbal and
pictorial. Jensen and Rohwer found that
the acquisition of a list of paired pictures
by second-, fourth-, and sixth-grade chil-
dren was markedly accelerated by the in-
struction to form and utter a sentence con-
taining the names of the two objects in
each pair. In this and a number of subse-
quent experiments (Rohwer, 1966; Rohwer
& Lynch, 1966; Rohwer, Shuell, & Levin,
1968) the facilitory effect of sentence con-
texts has been clearly demonstrated. On
the pictorial side, Davidson (1965) has
shown that the learning of paired objects
represented by line drawings is determined
by the spatial configuration of the two
members in a pair. When the two objects in
each pair were depicted independently, the
amount learned was notably smaller than
when the two objects were joined to one
another visually (e.g., a picture of a chain
and a bowl, versus a picture of a chain in a
bowl). Reese (1965) found that verbal de-
scriptions of relationships between objects
as well as pictorial representations of such
relationships facilitated PA learning in
young children.

The samples of children involved in all 
of the experiments reviewed thus far were 
drawn from schools in areas populated by 
middle- or upper-income groups. This fact 
is noteworthy in connection with the sec- 
ond topic of pertinence to the present 
study, namely, group-related differences in 
learning proficiency. It has been shown re-
peatedly that when learning proficiency is
measured in terms of performance on
standardized tests of school achievement
or on commonly used tests of intelligence,
children from schools serving middle- and
upper-income populations are superior to
children from schools serving lower-income
populations (e.g., Brown & Deutsch, 1965;
Wilson, 1963). It remains to be established
whether or not the deficiencies in what and
how much children from low-strata schools
have learned are related to concomitant
deficiencies in the performance of such
children on tasks that demand new learn-
ing. Results reported by Semler and Iscoe
(1963) suggest that such a relationship
may indeed hold. On a PA task, 5- and
6-year-old white children learned more ef-
ficiently than Negro children from rela-
tively lower-strata schools. No differences
were found for older children from the
two groups.

One of the purposes of the present exper- 
iment was to assess the generality of the 
Semler and Iscoe findings for a different 
PA task and for groups distinguished pri- 
marily in terms of school strata rather 
than in terms of race. A second purpose 
was to determine whether or not the de- 
ficiency in PA learning expected to appear 
among young children from low-strata 
schools could be ameliorated by presenting 
PAs under conditions known to facilitate 
learning in children drawn from upper- 
strata schools. 


METHOD


Subjects 


The total sample of 432 children was drawn 
from three grade levels (first, third, and sixth) in 
two kinds of schools distinguished by the char- 
acteristic performance of their students on stand- 
ardized tests of achievement and aptitude. Half 
the subjects (Ss) were drawn from schools where 
test performance was low. Available information 
about the six populations from which the samples 
were selected is presented in Table 1. In addition 
to discrepancies in test performance, the high- and 
low-strata school populations differed in other 
ways associated with the distinction between “ad- 
vantaged” and “disadvantaged” areas. For ex- 
ample, the modal occupational category of fathers 
of students in the high-strata schools was pro- 
fessional whereas that of fathers of students in the 
low-strata schools was semi-skilled or unskilled 
manual. In sum, the two populations were selected
because of the contrast between them with re-
spect to the learning proficiency of their students
as assessed by standardized test performance and 
with respect to other characteristics often pre- 
sumed to be related to the success of children in 
school learning. 

From the total population of children within 
each grade level of the high- and low-strata 
schools, 72 Ss were selected and assigned ran-
domly to one or another of the six experimental

TABLE 1
MEAN CHRONOLOGICAL AGES AND STANFORD
ACHIEVEMENT TEST GRADE-EQUIVALENT
QUARTILES FOR GRADES 1, 3, AND 6
OF THE TWO SCHOOL-STRATA
POPULATIONS


                          Mean CA    Word Meaning         Paragraph Meaning
School strata                        Q1    Q2    Q3       Q1    Q2    Q3

Grade 1
  High                    6.60
  Low                     6.98

Grade 3 (Primary I, Form W)
  High                    8.57
  Low                     8.97       1.7   2.0   2.7      1.6   1.9   2.5

Grade 6 (Intermediate II, Form W)
  High                    [entries not legible]
  Low                     [entries not legible]

Note.—No data are available for the Grade 3 high-strata 


conditions such that each cell of the design was 
comprised of 12 Ss. 
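
The sample-size arithmetic of the design can be checked with a short sketch. The factor levels are taken from the text (three grades, two school strata, three Verbalization conditions, two Depiction conditions, 12 Ss per cell); the variable names are illustrative only.

```python
from itertools import product

grades = ("first", "third", "sixth")
strata = ("high", "low")
verbalization = ("names", "phrases", "sentences")
depiction = ("still", "action")
ss_per_cell = 12

# Every combination of the four factors is one cell of the factorial design.
cells = list(product(grades, strata, verbalization, depiction))

# Six experimental conditions (3 Verbalization x 2 Depiction) within each
# grade-by-strata population.
conditions_per_population = len(verbalization) * len(depiction)
ss_per_population = conditions_per_population * ss_per_cell
total_ss = len(cells) * ss_per_cell

print(len(cells), ss_per_population, total_ss)
```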


Materials and Design 


In addition to grades and school strata, the 
principal factors in the 3 × 2 × 3 × 2 factorial
design were Verbalization (names versus phrases 
versus sentences) and Depiction (still versus ac- 
tion). All Ss were asked to learn the same list of 
24 pairs of familiar objects presented pictorially 
by a pairing-test method. The three Verbaliza-
tion conditions differed only with respect to the
character of the experimenter's (E's) utterances
during the pairing trials. As each pair was pre-
sented, E, using a prepared script, read either the
names of the two objects (e.g., “dog. ... gate”), a
phrase containing the names of the two objects
(e.g., “The dog and the gate.”), or a sentence
containing the names of the two objects
(e.g., “The dog closes the gate.”). A complete list
of the verbal materials appears in Table 2. The 
comparison of principal interest was that between 
name and sentence conditions but since the presen- 
tation rate was constant for all conditions, the 
phrase condition was included to control for po-
tential differences in rehearsal time during the pair-
ing trials. Previous research (Rohwer, 1966) has
indicated that phrases similar to those shown in
Table 2 are adequate for this purpose since they
do not bias performance while still providing a
grammatical context to fill the pairing interval.
Note that in the materials for the phrase condi-
tions all of the forms that serve as connectives be-
tween the noun pairs are conjunctions and that
only two of these are used such that each one is
repeated 12 times in the list. Since there are only
two repetitions of connective words in the sen-
tence materials, it might be supposed that what-
ever performance differences emerge between the
two conditions should be attributed to differences
in intralist similarity rather than to differences
in the facilitory efficacy of sentences and conjunc-
tion phrases. The results of an experiment per-
formed to test this supposition, however, clearly
indicate that it is false. Rohwer and Lynch (1967)
found that sentences produce PA performance su-
perior to that produced by conjunction phrases
even when the numbers of connective repetitions
were equated for the two kinds of context.

TABLE 2
[Verbal materials for the 24 noun pairs; most entries are not legible. Legible examples: “The fork (or/cuts) the cake.” “The hand (or/throws) the hat.”]

tions. The test-trial materials were the same for
both Depiction conditions, that is, they con-
sisted of 24 4-second segments of film bearing the
images of the first-named objects in every pair.
Two different random orders of pairing-trial and
of test-trial materials were formed so that no order
was repeated during the course of the two com-
plete trials given to all Ss.

In addition to the four principal factors in the
design, the experiment was entirely balanced with
respect to experimenters, of whom there were two,
both male graduate students.


Procedure


The task was administered to Ss individually for
a total of two pairing and two test trials. Instruc-
tions informed the Ss as to the procedure that
would be followed and asked them to study each
of the pictures of objects in order to remember
which ones were presented together. One or more
examples were given orally without pictorial sup

The second experimental factor, Depiction, con- 
sisted of two levels that differed with respect. to 
whether the object pairs were presented in a man- 
ner consistent with the name and phrase verbali- 
zations in the one case (still) or in a manner con- 
sistent with the sentence verbalizations in the 
other case (action). In the materials for both De- 
Piction conditions, the pairs of objects were photo- 
graphed against a background of gray cloth and 
their images were recorded on 16-mm. black-and- 
white movie film. For the still condition, the two 
objects in each pair were simply placed side by 
side on the set and photographed for 4 seconds, 
For the action condition, the pairs of objects were 
photographed while involved in the episodes de- 
scribed by the corresponding sentence verbaliza- 


port until Ss appeared to understand the task 
During both pairing trials, the appropriate pic-
torial materials were presented on a beaded screen
by means of a movie projector, and, as each pair
appeared, E read aloud the appropriate verbaliza-
tion. On the test trials, E uttered the name of each
object when it appeared on the screen and re-
corded Ss’ responses, which were made orally.
On pairing and test trials, each item was visible
for 4 seconds, the interitem interval was 1 second,
and the intertrial interval was 4 seconds.
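
The timing parameters just given imply a brief session. The sketch below works through the arithmetic under two assumptions that the text does not state: that the 1-second interitem interval occurs only between consecutive items, and that a 4-second intertrial interval separates each of the four trials. The function names are illustrative only.

```python
# Session-length arithmetic for the paired-associate task (illustrative only).
ITEMS_PER_TRIAL = 24      # 24 object pairs in the list
ITEM_SECONDS = 4          # each item visible for 4 seconds
INTERITEM_SECONDS = 1     # assumed to fall only between consecutive items
INTERTRIAL_SECONDS = 4    # assumed to fall only between consecutive trials
TRIALS = 4                # two pairing trials plus two test trials


def trial_seconds(items: int = ITEMS_PER_TRIAL) -> int:
    """Duration of one trial: item exposures plus the gaps between them."""
    return items * ITEM_SECONDS + (items - 1) * INTERITEM_SECONDS


def session_seconds(trials: int = TRIALS) -> int:
    """Duration of all trials plus the gaps between trials."""
    return trials * trial_seconds() + (trials - 1) * INTERTRIAL_SECONDS


print(trial_seconds())    # 119
print(session_seconds())  # 488
```

Under these assumptions the whole task occupies a little over 8 minutes per child, exclusive of instructions.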


Results


Learning was measured in terms of the
total numbers of correct responses made on
the two test trials. The mean numbers of
correct responses obtained by the two Es
were very close (32.09 versus 31.61) and
since a simple analysis of variance revealed
that the difference was not reliable (F <
1) this variable was ignored in the remain-
ing treatment of the results.

A four-way analysis of variance was applied to
the data produced by the factorial design. As
expected, the main effect of grades was significant
(F = 20.51, df = 2/396, p < .01) such that the
amount learned by sixth graders (33.92) was larger
than that learned by third graders [mean illegible
in this copy] which, in turn, was larger than that
learned by first graders (29.48). The variance
associated with school strata, however, was not
significant (F < 1); as an inspection of Table 3
suggests, the average performance of children from
low-strata schools was virtually the same as that
of children from high-strata schools. The main
effect of Verbalization was significant (F = 17.58,
df = 2/396, p < .01) and a comparison of the three
component means revealed that the effect was
comprised entirely of the superiority of the
sentence condition over both the phrase and the
name conditions. Within Depiction, action was
associated with more correct responses than was
still (F = 108.56, df = 1/396, p < .01).

¹ Unless otherwise indicated in the text, all
post hoc comparisons were made by means of the
Scheffé method at p = .05.
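
The degrees of freedom reported for these tests follow directly from the factorial layout. The sketch below reproduces them, assuming 12 Ss in each of the 36 cells; that cell size is read from the design description and is treated here as an assumption. The helper name `effect_df` is illustrative.

```python
from math import prod

# Factor levels of the 3 x 2 x 3 x 2 design.
levels = {"Grades": 3, "Strata": 2, "Verbalization": 3, "Depiction": 2}
SUBJECTS_PER_CELL = 12  # assumption: equal n in every cell

cells = prod(levels.values())
total_subjects = cells * SUBJECTS_PER_CELL
error_df = total_subjects - cells  # within-cell degrees of freedom


def effect_df(*factors: str) -> int:
    """Numerator df for a main effect or interaction: product of (levels - 1)."""
    return prod(levels[f] - 1 for f in factors)


print(cells, total_subjects, error_df)       # 36 432 396
print(effect_df("Grades"))                   # 2
print(effect_df("Depiction"))                # 1
print(effect_df("Grades", "Verbalization"))  # 4
```

The error term of 396 df and the numerator values of 2, 1, and 4 agree with the df reported alongside the F ratios in this section.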

Within the analysis of variance, the 
tests that are critical for an evaluation of 
the experimental hypotheses are: the in-
teraction of Strata × Verbalization ×
Depiction; and the interaction of Grades
× Strata × Verbalization × Depiction.
Before the results of these tests are re-
ported, consider the core interaction, Ver-
balization × Depiction, ignoring the clas-
sification variables of strata and grades.
The relevant means are presented in Table 
3. In agreement with our expectations, this 
interaction was significant (F = 14.51, df 
= 2/396, p < .01) and it was located 
entirely in the difference between those 
conditions designed to be facilitory and 
those not so designed. That is to say, the 
amount learned in the name-still and the 
phrase-still conditions was significantly 
smaller than the amount learned in any 
one of the sentence or action conditions. 
None of the other pair-wise contrasts was 
significant. Sentence verbalizations and 
visual translations of those sentences pro- 
duced equivalent levels of performance. 
Similarly, name and phrase conditions were 
associated with equal amounts of learning, 
indicating that the increased opportunity 
for rehearsal provided in the former was of 
no advantage. 
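
The form of the Verbalization × Depiction interaction can be seen descriptively as a difference of differences. The sketch below uses the subtotal means (pooled over strata) as read from Table 3; it illustrates the pattern only and is not a substitute for the F test or the Scheffé comparisons reported above. The function name is illustrative.

```python
# Subtotal means (pooled over school strata) as read from Table 3.
means = {
    "still":  {"name": 26.92, "phrase": 26.29, "sentence": 33.43},
    "action": {"name": 34.74, "phrase": 34.69, "sentence": 35.04},
}


def sentence_gain(depiction: str) -> float:
    """Sentence mean minus name mean within one Depiction level."""
    row = means[depiction]
    return round(row["sentence"] - row["name"], 2)


gain_still = sentence_gain("still")
gain_action = sentence_gain("action")
# Difference of differences: the descriptive core of the interaction.
interaction_contrast = round(gain_still - gain_action, 2)

print(gain_still, gain_action, interaction_contrast)  # 6.51 0.3 6.21
```

Sentences add about 6.5 correct responses when the pictures are still but almost nothing when the pictures already depict the action, which is the pattern the text describes.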

Turning now to the interactions of critical
interest, the analysis revealed that neither was
significant. The form of the interaction of
Verbalization × Depiction was comparable for all
Ss, whether they were sampled from low-strata
schools or from high-strata schools (F = 2.67,
df = 2/396, .05 < p < .10). An examination of the
means shown in Table 3 suggests, and an application
of the Scheffé method confirms, that the only
difference between strata lies in the marginal
superiority of the low-name-action group over the
high-name-action group (.05 < p < .10). Contrary to
our prediction, Ss from low-strata schools
performed no less well than Ss from high-strata
schools in both the customary conditions of PA
learning and in the facilitory conditions.
Furthermore, the form of this three-way interaction
was comparable for all grades, that is, the
four-way interaction was not significant (F < 1).
In sum, the present experiment produced no evidence
in support of the assertion that children from
low-strata schools learn PAs less efficiently than
children from high-strata schools.


TABLE 3
Mean Numbers of Correct Responses as a
Function of Strata, Depiction, and
Verbalization

Depiction | School   | Name  | Phrase | Sentence | Total
Still     | High     | 27.36 | 26.66  | 33.14    | 29.04
          | Low      | 26.48 | 25.92  | 33.72    | 28.70
          | Subtotal | 26.92 | 26.29  | 33.43    | 28.88
Action    | High     | 33.00 | 35.08  | 35.72    | 34.60
          | Low      | 36.48 | 34.30  | 34.36    | 35.04
          | Subtotal | 34.74 | 34.69  | 35.04    | 34.82
Total     |          | 30.83 | 30.49  | 34.24    | 31.85

MSE (396) = 35.15.

The only other significant term in the
analysis of variance was the interaction of
Grades × Verbalization (F = 3.70, df =
4/396, p < .01). The form of this interac-
tion was such that the difference between
name and sentence conditions was larger
in the first than in the third and sixth
grades. The only supportable interpreta-
tion of this interaction, in view of the fact
that the form of the Verbalization × De-
piction interaction was equivalent for all
grades (F = 1.26, df = 4/396, p > .25),
is that the facilitory effects of sentence
verbalization were obscured in the higher
grades by the effects of the depiction con-
ditions.

Discussion 


The relatively high degree of learning 
proficiency observed among children from 
low-strata schools is at once the most
puzzling and the most promising aspect of
the present results. Evidence available be- 
fore the learning task was administered led 
to the expectation that such children would 
be distinguished by their inability to en- 
gage successfully in new learning. If per- 
formance on standardized tests of school- 
related achievement is taken as an index of 
how much children have been able to learn 
up to the time of examination and if, as 
was the case in the present study, two 
groups of Ss are equated for chronological 
age but still differ markedly in their test 
scores, a possible inference is that the 
members of the low-scoring group are de-
ficient in learning ability. Obviously, this
argument is too simple-minded in the sense 
that equivalence of chronological age does 
not necessarily imply equivalence of op- 
portunity for relevant learning. Neverthe- 
less, the teachers of the children from the 
low-strata schools corroborated the sim- 
plistic inference indicated by standardized 
test performance in describing their stu- 
dents as being slow to learn and difficult 
to teach. Furthermore, the performance of 
low-strata children on school-related tests 
of achievement is predictive of subsequent 
success in school learning. Thus it seems 
unwarranted to conclude that standardized 
test performance is unrelated to learning 
proficiency and yet the results of the pres- 
ent experiment seem to imply just this 
conclusion.

Two interpretations of the discrepancy 
between test and learning task perform- 
ance remain to be considered. The first is 
that PA learning is unrelated to the kinds 
of learning in which a child must engage 
in order to perform successfully in school 
and on tests of school achievement. Al- 
though this interpretation cannot be dis- 
counted, we are inclined to dismiss it on 
the assumption that a careful description 
and analysis of the kind of learning teach- 
ers attempt to induce in students, es- 
pecially in first-grade curricula, would re- 
veal numerous instances of similarity to 
the PA paradigm. A second, and, in our 
view, a more likely interpretation of the 
discrepancy is that it occurs because of 
pronounced differences between the condi-
tions of learning that are characteristic of
the classroom and those that are charac-
teristic of the laboratory.

In brief, three kinds of such differences 
may be distinguished. First, greater con- 
trol of the focus of the child’s attention is 
achieved in the laboratory than in the 
classroom by (a) administering the learn-
ing materials individually rather than to
groups, and, in the special case of the pres-
ent study, by (b) presenting the elements
to be learned in a form that elicits the at- 
tention of the child. Second, the require- 
ments of the child’s task are explicitly de- 
tailed to a much greater extent in the 
laboratory than in the classroom. Third, in 
the laboratory case, the information neces-
sary for the child to make a judgment
about the adequacy of his performance is
inherent in the learning materials them- 
selves, whereas in the classroom such in- 
formation is typically made available only 
in the teacher’s reaction to the child’s be- 
havior and not within the boundaries of 
the task itself. Whether or not these dif- 
ferences between the conditions of learning 
in the classroom and in the laboratory are 
responsible for the discrepancy between the 
performance of low-strata children on 
standardized tests and their performance 
on learning tasks, it should be noted that 
the higher incidence of success in the 
laboratory than in the classroom, at least 
in the present study, may itself reinforce 
the behaviors that lead to efficient learn-
ing.

Aside from the foregoing remarks that 
are clearly and admittedly speculative, the 
results of the present experiment demon- 
strate empirically that the efficiency with 
which children, whether they are drawn
from low- or high-strata schools, learn PAs
can be notably affected by the manner in 
which the items are presented. Both the 
verbal condition of sentence contexts and 
the pictorial condition of action episodes 
proved facilitory for all groups of Ss. Thus 
the relevance of the results to the prob-
lems of the design of educational proce-
dures and materials is by no means con-
fined to upper-strata populations.

The present results diverge from those
reported by Semler and Iscoe (1963) only
with regard to the performance of the first- 
grade or 6-year old samples. This diver- 
gence may be attributable either to task 
differences or to differences in the way 
populations were defined in the two stud- 
ies. More specifically, one possibility is 
that the method of presenting learning 
materials used in the present experiment 
may have elicited more constancy of at- 
tention from low-strata 6-year olds than 
that used by Semler and Iscoe. Further- 
more, attention was only required for the 
duration of two trials in the present study 
in contrast to the 12 trials administered in 
the previous one.


RELATIONSHIPS BETWEEN TRAINING METHODS
AND LEARNER CHARACTERISTICS¹


G. K. TALLMADGE²
American Institutes for Research in the Behavioral Sciences 
Palo Alto, California 


The aim of this project was to determine whether training effective- 
ness could be increased by employing training methods which differed 
as a function of trainee characteristics. A study was designed in-
volving a control and 2 experimental training methods and 16
measures of trainee aptitudes and interests. The experimental train- 
ing methods were designed to reflect Gagné’s (1965) Type 3 (Chain- 
ing) and Type 7 (Principle Learning) theoretical constructs. Large 
achievement differences resulted from the 3 training methods. No 
interactions between training methods and learner characteristics 
were found, however, either with single aptitude measures, with 
combined measures, or by means of covariance analysis. It was con-
cluded that these negative findings resulted from the existence of
interactions between subject matter content and training methods. 


The entire field of mental testing has 
grown out of an awareness that indi- 
viduals differ in aptitudes, interests, and 
personality traits. Educational practice, 
however, has traditionally viewed these 
differences as something of an incon- 
venience, and has only recently recognized 
the potential advantages of individualized 
instruction. 

There are several different ways in which
instruction can be designed to accommodate
individual differences. To date, most research
efforts in this area have been concerned with
accommodating individual differences in ability
level and have employed such techniques as
branching auto-instructional programs, self-paced
learning, and others. Some research has also been
conducted investigating relationships between types
or profiles of learner aptitudes and training
methods. It is this latter area which was of
concern in the research reported here.


¹ This report is based on research conducted
under contract N-61339-66-C-0043 with the Naval
Training Device Center, Port Washington, New
[...] is granted for reproduction, translation,
publica- [...] of the Navy Department or the Naval
Training Device Center.
² [...] assistance of J. W. Shearer in planning and
conducting the research described herein, and of
B. J. Anderson, M. A. Chapman, S. R. Ford, and
J. V. [...]

Several investigators have reported sig- 
nificant interactions between instructional 
method variables and learner character- 
istics but, for one reason or another, these 
studies have all been inconclusive. Edger- 
ton (1958), for example, found a signifi- 
cant positive correlation between a word 
fluency test and achievement in an air-
craft familiarization course taught by a
“rote” method, but no correlation between 
the same measures when the course was 
taught so as to emphasize understanding. 
Similarly, he found a significant negative 
relationship between a memory test and 
achievement in the same course taught by 
the understanding method, but no relation- 
ship between these measures when the 
course was taught by rote methods. He 
concluded that trainees with high word- 
fluency scores should be taught by the rote 
method, and those with low memory scores 
should be taught by the “why” method. He 
failed to consider in this conclusion, how- 
ever, either the correlation between the two 
aptitude measures or, more importantly, 
the fact that the “why” training method 
produced significantly higher achievement 
on an overall basis.

Bush, Gregg, Smith, and McBride
(1965) treated the aptitude measures
in their study in a sound manner, but con-
founded training method and subject mat-
ter content in such a way that it was im-
possible to determine whether reported
interactions were between learner character-
istics and training methods or between
learner characteristics and course content.
The latter alternative appeared, perhaps,
more likely.

Other studies could be referenced and 
reasons cited for questioning the apparent 
interactions reported between instructional 
methods and learner characteristics. The 
major difficulty appears to be that few 
studies have been designed specifically to 
assess interaction effects. Where the pri-
mary research goal has been to assess the
overall effectiveness of different instruc-
tional treatments, evaluation of apparent
interactions has typically been hampered 
by inadequate experimental or statistical 
controls. 

The study reported here was designed 
specifically to test interactions between 
trainee characteristics (aptitudes and in- 
terests) and training methods. 


METHOD


The study reported here was conducted in the 
setting of the United States Navy Radarman 
Class A (RD/A) School at Treasure Island, Cali- 
fornia. A 1-week segment of the 16-week RD/A 
curriculum which covered maneuvering board 
topics was selected as the research vehicle for the 
study. This choice was based on practical as well 
as theoretical reasons which are discussed elsewhere 
(Tallmadge & Shearer, 1967) and which are too 
lengthy to repeat here. 

An analysis was made of course objectives, and 
statements of specific behavioral objectives were 
formulated. Based on these behavioral objectives, 
a 32-item criterion test was developed. A multiple-
choice format was adopted for this test although
all but four test items required some computation.
The majority of test items required the plotting of
points in polar coordinates, the drawing of vectors,
the solution of vector diagrams, and the conver-
sion of time, speed, and distance parameters. The
incorrect answers provided for these test ques-
tions were designed to reflect common types of er-
rors.
Two experimental versions of the 1-week
Maneuvering Board course were designed. The
first of these experimental courses (E-1) was
designed to emphasize Gagné’s (1965) Type 3
learning. It was oriented toward the rote memori-
zation of fixed procedures for solving maneuver-
ing board problems. The second experimental
course (E-2) was designed to reflect Gagné’s Type 7
learning. It included, in addition to problem-solv-
ing procedures, the principles, concepts, and ra-
tionales underlying them. The E-2 course also
involved use of a specially designed training de- 
vice (Tallmadge & Shearer, 1967) to provide vis- 
ual demonstrations of relevant relative motion 
principles. 

The subjects (Ss) for this study were 166 
Navy enlisted men enrolled at the Treasure Island 
RD/A School. Navy Basic Battery aptitude test 
scores were obtained from existing records for 
these Ss. This test battery is composed of a 
General Classification Test (GCT), an Arithmetic 
Test (ARI), a Mechanical Test (Mech), a Cleri- 
cal Test (Cler), and an Electronic Technician 
Selection Test (ETST). The Spatial Orientation 
and Spatial Visualization subtests of the Guilford- 
Zimmerman Aptitude Survey and the Kuder Pref- 
erence Record, Vocational Form B, were also ad- 
ministered to all Ss. The choice of these particular 
tests was based on a desire to cover all of the 
traditionally accepted mental abilities and to as- 
sess interests in a generic manner rather than in 
terms of specific vocations. A more complete de- 
scription of the rationale for test selection is pre- 
sented elsewhere (Tallmadge & Shearer, 1967). 

Three versions of the Maneuvering Board 
course were included in the study: (a) the stand-
ard, or control, course; (b) E-1; and (c) E-2.
Each version of the course was administered to
an approximately equal number of trainees. The
same Navy-provided instructor taught all courses 
throughout the study. He was intensively trained 
in administering the two experimental courses, and 
his classroom performance was continually mon- 
itored by project personnel. 

Total classroom time was held constant for the 
three courses although substantial differences oc- 
curred in active teaching time. The E-1 course 
required the least teaching time and the E-2 course 
required the most. These differences were compen- 
sated for by providing supervised practice, review, 
and drill on problem-solving procedures at ap- 
propriate points throughout the course. At the 
end of the 1-week course, achievement was meas- 
ured by means of the criterion test. Trainees were 
given 2 hours to complete the test. 

Three alternate forms of the test were used 
and were administered in a random order to the 12 
school classes which participated in the study. Two 
scores were computed for each trainee: (a) MB
scores = items correct; and (b) CMB scores =
items correct minus one-third items incorrect.
(Note: Although all Ss reached the last test item, 
a significant number of items were omitted, par- 
ticularly by the E-2 group.) 
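
The two scoring rules can be stated compactly. The sketch below is illustrative: the function names and the example trainee are hypothetical, and the remark that a one-third penalty corresponds to four-option items is an inference from the usual correction-for-guessing convention, since the number of answer options is not stated in the text.

```python
def mb_score(correct: int) -> int:
    """MB score: number of items answered correctly."""
    return correct


def cmb_score(correct: int, incorrect: int) -> float:
    """CMB score: items correct minus one-third of items incorrect.

    Omitted items are neither rewarded nor penalized, so the two scores
    diverge more for trainees who guess than for trainees who omit.
    """
    return correct - incorrect / 3


# Hypothetical trainee on the 32-item test: 20 right, 9 wrong, 3 omitted.
print(mb_score(20))      # 20
print(cmb_score(20, 9))  # 17.0

# Same number right, but the doubtful items omitted instead of guessed wrong.
print(cmb_score(20, 0))  # 20.0
```

Because omissions carry no penalty under the CMB rule, groups that differ in their tendency to omit items can be ranked differently by the two criteria, which is why both scores were analyzed.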

It was intended that patterns of aptitudes and 
interests be examined. (One plausible hypothesis 
was that Ss more interested in clerical than scien- 
tific areas would do better in the E-1 course while 
Ss with the reverse interest pattern would do better 
in the E-2 course.) The number of possible combi-
nations of aptitude/interest patterns was very
large, however (120 possible pairs of measures),
and a decision was therefore made to examine in- 
dividual measures first to look for promising leads. 
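
The figure of 120 is simply the number of ways of choosing 2 measures from 16. A brief check follows; the labels m1 to m16 are placeholders, not the names of the actual tests.

```python
from itertools import combinations
from math import comb

N_MEASURES = 16  # aptitude and interest measures in the study

print(comb(N_MEASURES, 2))  # 120

# Enumerating the pairs explicitly, with placeholder labels m1..m16.
labels = [f"m{i}" for i in range(1, N_MEASURES + 1)]
pairs = list(combinations(labels, 2))
print(len(pairs), pairs[0], pairs[-1])  # 120 ('m1', 'm2') ('m15', 'm16')
```

Patterns involving three or more measures would multiply the count further (560 triples, for example), which makes the decision to screen single measures first easy to understand.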

For the purpose of these single measure 
analyses, Ss were sorted into high and low groups 
on each of the 16 aptitude/interest measures. Sepa- 
rate two-by-three factorial design analyses of vari- 
ance were conducted for each of the 16 aptitude/ 
interest measures using each of the two dependent 
variables (MB scores and CMB scores). Because 
unequal cell frequencies resulted from the manner 
in which data were collected, the unweighted 
means technique (Winer, 1962) was used in these 
analyses.
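
A minimal sketch of an unweighted means analysis for a two-by-three layout with unequal cell sizes is given below. It follows the general approach described in standard texts (cell means treated as single observations, scaled by the harmonic mean of the cell sizes, and tested against the pooled within-cell mean square); it is not a reproduction of the computations actually carried out in the study. The scores are hypothetical and the function name is illustrative.

```python
from statistics import mean

# Hypothetical MB scores; keys are (aptitude level, training method).
cells = {
    ("low", "control"): [10, 12],
    ("low", "E-1"): [11, 13, 12],
    ("low", "E-2"): [14, 16],
    ("high", "control"): [14, 16, 15],
    ("high", "E-1"): [15, 17],
    ("high", "E-2"): [18, 20, 19],
}
rows = ["low", "high"]
cols = ["control", "E-1", "E-2"]


def unweighted_means_anova(cells, rows, cols):
    """Return F ratios for rows, columns, and interaction (unweighted means)."""
    p, q = len(rows), len(cols)
    m = {k: mean(v) for k, v in cells.items()}             # cell means
    n_h = p * q / sum(1 / len(v) for v in cells.values())  # harmonic mean n
    grand = mean(m.values())
    row_m = {r: mean(m[(r, c)] for c in cols) for r in rows}
    col_m = {c: mean(m[(r, c)] for r in rows) for c in cols}

    ss_rows = n_h * q * sum((row_m[r] - grand) ** 2 for r in rows)
    ss_cols = n_h * p * sum((col_m[c] - grand) ** 2 for c in cols)
    ss_int = n_h * sum(
        (m[(r, c)] - row_m[r] - col_m[c] + grand) ** 2
        for r in rows for c in cols
    )
    # Error term: pooled within-cell variation on N - pq degrees of freedom.
    ss_within = sum(sum((x - m[k]) ** 2 for x in v) for k, v in cells.items())
    df_within = sum(len(v) for v in cells.values()) - p * q
    ms_within = ss_within / df_within

    return {
        "n_h": n_h,
        "df_within": df_within,
        "F_rows": (ss_rows / (p - 1)) / ms_within,
        "F_cols": (ss_cols / (q - 1)) / ms_within,
        "F_interaction": (ss_int / ((p - 1) * (q - 1))) / ms_within,
    }


result = unweighted_means_anova(cells, rows, cols)
print(result)
```

The hypothetical data were built so that the high group exceeds the low group by the same amount under every method; the interaction F is therefore essentially zero while both main effects are large, which mirrors the pattern of results reported in the next section.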


RESULTS 


A total of 16 unweighted means analyses
of variance involving single aptitude/in-
terest measures were computed for each of
the two criterion measures. Since the re-
sults obtained with these two criterion
measures were highly similar, only the MB
analyses are summarized here. Table 1
presents the F ratios and associated p val-
ues resulting from the MB analyses for
both main effects and the interaction effect.


TABLE 1
Summary Statistics from Sixteen Unweighted
Means Analyses of Variance Using
MB Criterion Scores

Measure                     | F, training method | F, aptitude/interest | F, interaction
Navy Basic Battery          |                    |                      |
  GCT                       | 8.95***            | 10.75**              | <1.00
  ARI                       | 7.73***            | 19.09***             | <1.00
  Mech                      | 8.18***            | <1.00                | 1.04
  Cler                      | 7.38***            | 3.81*                | 1.14
  ETST                      | 6.77**             | 8.34**               | <1.00
Guilford-Zimmerman          |                    |                      |
  GZ-V                      | 9.85***            | 15.18***             | <1.00
  GZ-VI                     | 11.76***           | 19.16***             | <1.00
Kuder Vocational Preference |                    |                      |
  K-M                       | 7.38***            | 1.79                 | <1.00
  K-Comp                    | 7.79***            | 1.01                 | <1.00
  K-Sci                     | 7.12**             | 1.04                 | 1.58
  K-Pers                    | 7.87***            | 2.81*                | <1.00
  K-Art                     | 7.88***            | <1.00                | <1.00
  K-Lit                     | 8.00***            | 1.57                 | <1.00
  K-Mus                     | 7.62***            | <1.00                | <1.00
  K-SocSer                  | 7.89***            | <1.00                | <1.00
  K-Cler                    | 8.18***            | <1.00                | 1.71

* p < .10.
** p < .005.
*** p < .001.


All analyses showed differences among
training methods which were significant at
either the .005 or the .001 level. Means
and standard deviations of MB and CMB
scores for each training method are pre-
sented in Table 2.


TABLE 2
Means and Standard Deviations of
Achievement Test Scores Follow-
ing Three Types of Training

Criterion | Control M | Control SD | E-1 M | E-1 SD | E-2 M | E-2 SD
MB        | 19.20     | 4.28       | 20.55 | 4.65   | 29.55 | 4.49
CMB       | 14.99     | 5.69       | 16.82 | 6.16   | 19.49 | 5.8

Five of the aptitude measures (GCT,
ARI, ETST, Spatial Orientation-GZ-V,
and Spatial Visualization-GZ-VI) were
found to be significantly related to MB
scores. These five aptitude measures were
also related to CMB scores, as was the
Navy Basic Battery Clerical test (p < …).
None of the analyses showed significant
interactions for aptitude/interest by train-
ing method. The highest obtained F ratio
was 1.71, whereas an F of 3.06 was re-
quired for significance at the .05 level (df
= 2/140).

Only four of the aptitude/interest meas-
ures produced interaction F ratios greater
than 1. Although these findings were not
promising, mean achievement scores within
cells for interaction of aptitude/interest
level and training method were examined
to determine whether patterns of scores on
these measures could be expected to pro-
duce significant and meaningful interac-
tions. For the purposes of this examination,
control group scores were discounted since
the control training method was some kind
of mixture of the two experimental meth-
ods and possible interactions involving the
control and one of the two experimental
methods rather than the two experimental
methods could not be meaningfully in-
terpreted.

Of all the possible combinations, that in-
volving the Navy Basic Battery Clerical
test and the Kuder Scientific interest scale
appeared to have the greatest potential for
producing a significant interaction. The
achievement difference (CMB scores) be-
tween the high and low clerical groups was
3.7 points within the E-1 training method
and only 0.13 points within the E-2 train-
ing method, while scientific interest was
negatively related to achievement for the
E-1 method (−0.8 points) and positively
related to achievement for the E-2 method
(+3.0 points).

To assess this relationship statistically,
the total sample of Ss was sorted into two
aptitude/interest groups: (a) those whose
clerical aptitude scores were higher
than their scientific interest scores and
(b) those whose scientific interest scores
were higher than their clerical aptitude
scores. CMB achievement scores were then
analyzed using the same analysis of vari-
ance model previously employed for the
single aptitude/interest measures. Again,
the interaction effect was not statistically
significant, F = 1.12, df = 2/157, although
the training method main effect was sig-
nificant, F = 6.24, df = 2/157, p < .005.
One final attempt was made to find a
significant interaction between training
methods and learner characteristics by em-
ploying covariance analysis techniques.
The Navy Basic Battery Arithmetic test
and the Guilford-Zimmerman Spatial Vis-
ualization test were selected as the most
promising covariates because they were the
tests most highly related to MB and CMB
achievement scores. The Kuder Scientific
Interest scale was selected as the inde-
pendent variable because it showed the
highest meaningful interaction F ratio in
previous analyses. (Kuder Clerical was re-
jected because its high interaction F ratio
was caused by an uninterpretable devia-
tion of the control group from the experi-
mental groups in terms of the pattern of
achievement scores.) MB scores were used
as the dependent variable.
The results of these analyses were also
negative for interaction of aptitude and
method, F = 2.44, df = 2/159 with the
Guilford-Zimmerman as a covariate and
F = 1.89, df = 2/159 with arithmetic con-
trolled by covariance analysis.


Discussion 


The main concern of the study reported 
here was an investigation of possible in- 
teractions existing between learner char- 
acteristics and methods of instruction. 
With respect to this issue, the findings of 
the research were negative. There were no 
significant interactions among the three 
training methods studied and the 16 
learner aptitude and interest measures. In 
view of other reported studies, these nega- 
tive results were surprising. 

Although the present study cannot be 
considered to provide any final answers to 
questions about training and individual 
differences, it is important to seek explana- 
tions for the negative findings. The pos- 
sibilities which appear most plausible are: 

1. The particular training methods em- 
ployed were responsible for the negative re- 
sults. Other training methods might inter- 
act with learner characteristics. 

2. Although the measured learner char- 
acteristics showed no interactions, other 
aptitude, interest, or personality factors 
might have. 

3. Other interactions existed, perhaps 
between the materials to be learned and the 
training methods employed, which acted in 
such a way as to obscure the interaction of 
interest here. 

Since this study was specifically designed 
to identify interactions between learner 
characteristics and instructional methods, 
and since it followed promising leads re- 
ported in other research, the first two of the 
above listed possibilities seemed less likely 
than the third. Although the study did not 
provide any direct support of the third 
alternative, its possibility was substanti- 
ated by a factor analysis (Tallmadge & 
Shearer, 1967) which showed the content 
of the Maneuvering Board course to be 
complex, involving manipulative skills, 
memory, and basic scientific background 
in approximately equal proportions. 

Support for the existence of interactions 
for subject matter by training method was
provided by the study discussed earlier
(Bush et al., 1965). This type of interac- 
tion could easily explain their findings. 
Finally, the plausibility of relationships 
existing between subject matter and in- 
structional methods has led other investi- 
gators (e.g., Briggs, Campeau, Gagné, & 
May, 1965) to work extensively in this 
area.

Although this study was not designed to 
investigate differences in the overall effec- 
tiveness of the training methods em- 
ployed, the research findings were inter- 
esting in this respect. The E-1 course was 
limited in content to coverage of those 
skills and knowledges covered by the final 
examination. It also provided substantially 
more time to drill and practice on these 
skills than the E-2 course. The E-2 course
covered many topics not included in the 
criterion test, yet it produced criterion per- 
formance significantly superior to that 
produced by E-1 training. It was believed 
that this finding indicated that the E-2 
course produced a higher type of learning 
in terms of Gagné’s (1965) hierarchical 


structure and supported his contention th; 
higher types of learning are retained bett, 
than lower types. 


ROLE OF SPECIFIC CURIOSITY IN SCHOOL ACHIEVEMENT

HY DAY


Ontario Institute for Studies in Education, University of Toronto 


In a series of 3 studies the importance of specific curiosity in
school achievement for Grades 7, 8, and 9 pupils was examined. A test
of specific curiosity was developed based on Berlyne’s definition of
specific exploration. The results show that while school grades
correlated significantly with IQ scores they almost invariably failed to
be related to the measure of specific curiosity. The measure of
specific curiosity used was shown to be related to teachers’ ratings of
curiosity in their pupils and to scores on the Barron-Welsh Art Scale.
Over an 11-month period pupils failed to increase their level of specific
curiosity.


Educational systems are continually being
charged with neglecting the development
of curiosity in their pupils, and with
rewarding students for rote learning and a
display of intelligence rather than for
extending their interests in the world about
them. If this is so, then an examination of
school grades should show that they are
more closely related to IQ scores than to
scores on some test of curiosity.

The concept of curiosity is of recent
origin and has been poorly defined. Fowler
(1965), in a recent review of this area,
suggests that curiosity is “a behavior without
a definition [p. 23]” although he continues
to argue its existence and importance.

Maw and Maw conducted a series of
studies on the evaluation of curiosity in the
classroom (Maw & Maw, 1965). They defined
curiosity as the need to extend one’s
knowledge into novel, strange, and incongruous
elements in the environment. Berlyne
(1963b), on the other hand, restricted
his definition of curiosity or exploratory
drive to “a state of unrest and distress ...
[which] can be brought on by perceiving
something under unfavourable conditions,
such that the small amount of information
received from the object in question
leaves considerable uncertainty regarding
the object’s characteristics [p. 302].” This
form of curiosity was said to induce the
organism to engage in specific exploration.

*The author wishes to express his gratitude to
the principals and staff of the two junior high
schools in the Toronto suburbs for their generous
cooperation in making children and facilities available.

It appears, therefore, that one reason 
for the difficulty in studying curiosity is the 
failure to distinguish between two types of 
curiosity. Specific curiosity describes the 
aroused state of an organism when con- 
fronted by an ambiguous or unclear stim- 
ulus and which may result in specific ex- 
ploration; while diversive curiosity de- 
scribes a general condition which may be 
analogous to what Maw and Maw consider 
as the need to seek new experiences or to 
extend one’s knowledge into the unknown, 
and which may elicit what Berlyne has 
termed diversive exploration. 

Recently it was shown that measures 
of exploratory choice after brief initial ex- 
posures seem to describe an inverted U- 
shaped function over complexity, such that 
there is commonly a tendency to attend to 
the more complex alternatives at a low 
level of complexity, but to avoid the more 
complex alternatives at a relatively higher 
level of complexity (Berlyne, 1968a; Day, 
1965). Selective attention to more complex 
figures (Day, 1965) and duration of ex- 
ploration (Day, 1966) also seem to increase 
with complexity up to a peak and then 
drop off. Moreover, verbal evaluation of 
interestingness appears to follow a similar 
function (Berlyne, 1968a; Day, 1965, 
1967b). It seems, therefore, that the verbal
response of “interesting” may indeed reflect
approach behavior to complex stimulation
and an intent to indulge in information-seeking
or drive-reducing exploration
when in the presence of a specific complex
stimulation.

It was proposed, therefore, to develop a
test of specific curiosity which would
measure the intent of the student to approach
high levels of visual complexity and
to withdraw from simple visual stimulation.

Using a series of 28 figures generated by
Berlyne (1963a) to examine various exploratory
and evaluative responses to perceptual
complexity, Day had student
teachers rank them along a continuum of
complexity (Day, 1965). Concordance
among Ss was found to be almost perfect
(W = .91, p < .001). Following this, other
students ranked the same figures along a
dimension of “interestingness.” Results
showed that Ss generally tended to evaluate
figures at the intermediate level of
complexity as most interesting.

A Kendall test of concordance showed 
strong agreement among Ss in their rank- 
ing of these figures (W = .382, p < .001). 
Yet large individual differences were found
in the shapes of the curves and in the level
of complexity at which interest
peaked. Vitz recently also showed that although
the average shape of a preference
function is an inverted U, this average is
derived from a large number of functions
varying widely in shape (Vitz, 1966).
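
The concordance statistic quoted above can be illustrated with a short sketch. It is not the author's computation: it assumes every judge ranks all items with no tied ranks, and the function name `kendalls_w` and the demonstration rankings are hypothetical.

```python
from typing import Sequence


def kendalls_w(rankings: Sequence[Sequence[int]]) -> float:
    """Kendall's coefficient of concordance W for untied rankings.

    rankings[j][i] is the rank (1..n) that judge j gave to item i.
    Assumption: no tied ranks, so no tie correction is applied.
    """
    m = len(rankings)                      # number of judges
    if m < 2:
        raise ValueError("need at least two judges")
    n = len(rankings[0])                   # number of items ranked
    if n < 2:
        raise ValueError("need at least two items")
    expected = list(range(1, n + 1))
    for judge in rankings:
        if sorted(judge) != expected:
            raise ValueError("each judge must use ranks 1..n exactly once")

    # Sum of the ranks each item received across all judges.
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2             # expected rank sum under no agreement
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Hypothetical data: three judges ranking four figures.
    demo = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
    print(round(kendalls_w(demo), 3))
```

W runs from 0 (no agreement) to 1 (identical rankings), which is why a significant but modest W for interestingness is compatible with large individual differences among Ss.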

It was therefore postulated that if individual
differences in preference for complex
visual stimulation do, in fact, exist
among school pupils, the test of specific
curiosity should measure these differences.
Moreover, if the charges of neglect in
developing curiosity in schools today are
valid, scores on this test would fail to
correlate with school grades.

In a recent study, Penney and McCann
(1964) found that reactive curiosity scores
did not correlate with IQ scores in a group
of 433 children in Grades 4, 5, and 6.
Although reactive curiosity as defined by
these authors appears to incorporate both
specific and diversive features, it was felt
that interest in complex stimulation should
not of necessity be related to the ability to
integrate complex stimulation into a
cognitive system (certainly a feature of
intelligent behavior), at least in children.


EXPERIMENT I


Method 


Subjects and procedure. One hundred and thirteen
pupils in Grades 7 and 8 at a junior high
school in North York, Ontario, participated in the
experiment. The test of specific curiosity (TSC)
consisted of responses to the 28 figures projected
on a screen in random order to groups of students
in their classrooms, 62 subjects (Ss) being
presented with the figures in one randomised
order while 51 Ss saw them in the reversed order.
Each figure was displayed on a screen for a
5-second interval followed by a second 5-second
interval of no-presentation to allow time for marking
the answer sheets. The Ss were instructed to
evaluate their degree of interest in each figure on
a 7-point scale. The Ss were reassured that the
information would not be made available to the
school and that the purpose of the study was to
evaluate the figures. Each session lasted no more
than 15 minutes.

Fig. 1. Distribution of mean interestingness rankings of 28 figures ranked for complexity (horizontal axis: complexity ranks).


Results 


Table 1 shows the results of an analysis 
of variance of the interestingness ratings. 
Order of presentation was not a significant 
variable but there were great differences in 
interest in the 28 stimulus figures. A greater 
portion of the total error variability ap- 
pears to have been contributed by the be- 
tween Ss variability. 

IQ scores measured on the Dominion
Group Test of Learning Capacity-Intermediate
and end-of-term examination
marks in all subjects were available for
112 pupils.


TABLE 1
SUMMARY OF ANALYSIS OF VARIANCE OF
INTERESTINGNESS DATA:
EXPERIMENT I

Source            df        MS         F

Between Ss       112      16.35
  Order            1       6.05       <1
  Error (b)      111      16.45       9.73*
Within Ss       3051       3.43
  Material        27     195.05     115.41*
  O × M           27       4.65       2.75*
  Error (w)     2997       1.69

*p < .001.


TABLE 2
CORRELATION COEFFICIENTS OF 112
GRADE 7 AND 8 STUDENTS

Subjects                              TSC       IQ

IQ                                    .01       —
English literature                   —.01      .67*
English (language)                   —.02      .64*
History                                0       .61*
Spelling                              .01      .59*
Math                                 —.02      .53*
Science                              —.08      .53*
Industrial arts or home economics    —.01      .46*
Music                                 .02      .49*
Geography                              0       .41*
Penmanship                             0       .41*
Art                                    0       .30*

*p < .01.


Fig. 2. Distributions of mean interestingness ratings of 28 figures ranked for complexity (horizontal axis: complexity ranks).

A TSC score for each S was derived by
summing the differences between the ratings
of the eight most complex figures and the
mean of that S’s ratings for all 28 figures,
together with the negative differences of the
six least complex figures from the same
mean. Tests of correlation
of these scores with IQ scores and grade
marks yielded the coefficients listed in Table 2.
As predicted, TSC scores failed to
correlate with IQ scores and with any of
the school grades, while IQ scores correlated
significantly with every one of the
grade marks.
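
The scoring rule for the TSC can be made concrete with a small sketch. It reflects one reading of the description above rather than the author's scoring procedure, and it assumes that each S's 28 ratings are already ordered from the least to the most complex figure; the function name `tsc_score` and its parameters are hypothetical.

```python
from typing import Sequence


def tsc_score(ratings_by_complexity: Sequence[float],
              n_simple: int = 6,
              n_complex: int = 8) -> float:
    """Specific-curiosity score for one S.

    ratings_by_complexity: interest ratings (e.g., on a 7-point scale),
    ordered from the least complex figure to the most complex figure.
    The ordering is an assumption of this sketch.
    """
    ratings = list(ratings_by_complexity)
    if len(ratings) < n_simple + n_complex:
        raise ValueError("not enough ratings for the two extreme groups")

    personal_mean = sum(ratings) / len(ratings)

    # Deviations above the S's own mean for the most complex figures ...
    complex_part = sum(r - personal_mean for r in ratings[-n_complex:])
    # ... plus sign-reversed deviations for the least complex figures.
    simple_part = sum(personal_mean - r for r in ratings[:n_simple])
    return complex_part + simple_part


if __name__ == "__main__":
    # Hypothetical S who rates simple figures low and complex figures high.
    example = [1] * 6 + [4] * 14 + [7] * 8
    print(round(tsc_score(example), 2))
```

Because every rating is expressed relative to the S's own mean, an S who rates all 28 figures identically scores zero regardless of whether the common rating is high or low; the score rises only when interest shifts from the simple toward the complex figures.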

Figure 2 shows the distribution of mean 
interestingness ratings for the 28 figures 
along the dimension of complexity. The 
similarity of this distribution to the ear- 
lier distribution of ranking scores by adults 
(Figure 1) should be noted. No differences
were found in specific curiosity scores between
male and female students (t = .71).


Discussion 


Interest in visual complexity appears to 
be unrelated to intelligence as measured by 
the Dominion Group Test and to grade 
scores, at least in this particular school. 
However, any generalization from these 
conclusions must be predicated on the find- 
ings that the test used is valid against 
other criteria of curiosity. 

It was decided, therefore, to conduct a 
second study in another school in which 
scores on the TSC would be compared to 
estimations of curiosity level by the stu- 
dents’ teachers, along the line of work done 
by Maw and Maw (1965). 


EXPERIMENT II


Method 


Subjects and procedure. The Ss in this study 
were 247 Grade 7 and 8 pupils in a junior high 
school in Lakeshore, Ontario, school district. 

The procedure was altered to include the presentation
of the Barron-Welsh Art Scale (BWAS),
a measure of preference for complex asymmetrical
figures (Welsh, 1959). The BWAS is presented
in a series of 62 slides and each S records whether
he likes or dislikes each pattern. Barron has found
that preference for complexity, as measured by
this scale, is related to personality characteristics
of creativity, flexibility, independence of judgment
and breadth of interest (Barron, 1963).

Since order of presentation of the TSC had not 
been found to be significant in the first study, all 
eight classes were presented with the same order 
of the TSC, but four of the classes judged the 
TSC before the BWAS while the other four classes 
judged them in the reverse order. 

The eight home-room teachers were then asked 
to rank the pupils in their class according to 
specific curiosity. In the instructions given to each 
teacher, specific curiosity was defined as follows: 

For the purpose of this study, a school child
is said to exhibit curiosity to the extent that
he:

1. Shows an interest in new, incongruous,
complex objects and events in his environment.

2. …jects which are not necessarily assigned.

3. Examines, explores, and/or handles,
studies, asks questions about, discusses some
topic raised in class or an object brought to
school by one of his classmates or the teacher.

4. Persists in such examinations, explorations,
and/or manipulations (for example, he
keeps studying about the topic or object until
he understands it more fully).

This definition of specific curiosity was modeled
on that of Maw and Maw (1965), but care was
taken to extract and develop only the specific
elements from their more inclusive definition.

The instructions to the teachers also suggested
that the teacher list the most curious at the top of
the list, then the least curious at the bottom of
the list, then the second most curious, etc., so that
the middle of the ranking would probably contain
the most uncertain and ambiguous decisions.


Results 


An analysis of variance revealed no order
effects but did yield significant differences
in Ss’ interest in the 28 figures.
Again, between-S variability was significantly
greater than within-S variability.
These results are presented in Table 3.

The means of the rankings of the 28 figures
here were compared with the means for the
first group, and a Kendall rank correlation
coefficient showed that the two groups
evaluated the figures similarly (τ = …,
p < .001). This can be seen in an examination
of Figure 2, which compares the
distributions of the means of the interestingness
ratings.

An analysis of variance of the two sets of
ratings, however, showed a significant difference
in interest in complexity between
the two schools. This is clearly demonstrated
in Table 4. Moreover, the Schools ×
Material interaction was significant, indicating
that the differences were in the slopes
of the distributions.

An examination of Figure 2 shows that
the differences occur mainly in the extremes
of the distribution, outside of the
two vertical lines, and including the data
which are used to establish the specific
curiosity scores. A t test of differences in
curiosity scores yielded highly significant
results (t = 6.49; p < .001), showing that
the pupils in the first school had a generally
higher level of specific curiosity.

Again, sex differences were not significant
(t = .081).

Since most of the classes numbered 30-35
pupils it was decided to compare scores
of the pupils ranked by the teacher in
the top quarter of the class (eight Ss) with
those in the bottom quarter. The result of a
t test of differences was highly significant
(t = 12.6; p < .001), indicating that the
TSC was in substantial agreement with
the teachers’ evaluations of curiosity in
their pupils.
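
The extreme-groups comparison can be sketched as follows. The source does not say which form of t test was used or how the eight classes were combined, so the pooled-variance formula, the helper names `top_and_bottom` and `pooled_t`, and the demonstration scores are all assumptions of this illustration.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence, Tuple


def top_and_bottom(scores_in_rank_order: Sequence[float],
                   k: int = 8) -> Tuple[List[float], List[float]]:
    """Split one class's TSC scores, listed from the teacher's most curious
    to least curious pupil, into the top k and the bottom k."""
    scores = list(scores_in_rank_order)
    if len(scores) < 2 * k:
        raise ValueError("class too small for two non-overlapping groups")
    return scores[:k], scores[-k:]


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


if __name__ == "__main__":
    # Hypothetical class of 16 pupils, most curious (by teacher rank) first.
    demo = [9, 8, 8, 7, 9, 6, 7, 8, 3, 2, 4, 1, 3, 2, 2, 1]
    high, low = top_and_bottom(demo, k=8)
    t, df = pooled_t(high, low)
    print(f"t = {t:.2f}, df = {df}")
```

Comparing only the extreme quarters sidesteps the middle of each teacher's ranking, which the instructions to teachers expected to contain the most uncertain decisions.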

TSC scores correlated significantly with
BWAS scores (r = .22; p < .01), suggesting
that high curiosity Ss tended to be
flexible and independent of judgment.
Intercorrelations with school marks are
presented in Table 5. They showed that some
of the grades in this school did correlate
significantly with TSC scores. Of interest
is the finding that, of the school grades
which did correlate significantly, some appear
to lack any relationship with curiosity
(e.g., penmanship).


TABLE 3
SUMMARY OF ANALYSIS OF VARIANCE OF
INTERESTINGNESS DATA:
EXPERIMENT II


TABLE 4
ANALYSIS OF VARIANCE OF INTERESTINGNESS
DATA BETWEEN THE TWO SCHOOLS

Source            df        MS

Between Ss       359     561.91
  Schools          1     544.36
  Error (b)      358      17.55
Within Ss       9720       4.01
  Material        27     391.33
  S × M           27     228.83
  Error (w)     9666       2.30

*p < .001.


IQ scores were not available at this 
school, so a correlation of specific curios- 
ity with intelligence was not obtained. 


Discussion 


The results of these two studies point 
clearly to the existence of a dimension of 
specific curiosity which can be defined as 
an “interest in complex stimulation.” The 
test developed to measure this dimension 
appears to discriminate levels of specific 
curiosity, at least among adolescents be- 
tween the ages of 12 and 16. Moreover, 
scores on this test were substantially in 
agreement with teachers’ ratings of their 
pupils along this dimension, and corre- 
lated significantly with Barron’s descrip- 
tion of individuals varying along a trait of 
flexibility and breadth of interest. The low 
correlation may be due to the fact that the 
BWAS is a test for adults who may be 
more discriminating in their preference for 
visual displays than are adolescents. 

Scores for the two schools are sufficiently
discrepant to stimulate a search for
the reasons. Among the possible causes for


lies). However, the scope of this study did 
not include an examination of the inter- 
school differences in curiosity. 


TABLE 5
CORRELATION COEFFICIENTS OF SPECIFIC
CURIOSITY AND SCHOOL GRADES OF 247
GRADE 7 AND 8 STUDENTS

Subjects                                r

English literature                    .029
English composition                   .224**
English grammar                      —.001
History                               .026
Mathematics                           .030
Physical education                    .009
Geography                             .060
Science                               .172*
Penmanship                            .170*
Music                                 .158*
Spelling                              .428**
Industrial arts or home economics     .239**
Barron-Welsh Art Scale                .220**

*p < .05.
**p < .01.


EXPERIMENT III


Method 


Subjects and procedure. The Ss in this study 
were 429 Grade 7, 8, and 9 pupils in the same 
junior high school as in Experiment I. 

The TSC was presented in the school auditorium 
with a complete grade participating in a session. 
The BWAS was also presented at the same ses- 
sions. 


Results 


Of the 113 pupils who had taken the test
in the previous sample, only 61 were present
for the retest. Test-retest correlation
was .48, significant at the 1% level. A t test
failed to show any significant shift in
specific curiosity scores over the 11-month
period (t = .66).
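
A test-retest coefficient of this kind is computed only over the Ss present on both occasions. The sketch below is an illustration under assumed inputs (scores keyed by a pupil identifier); the names `pearson_r` and `retest_correlation` and the demonstration data are hypothetical, not taken from the study.

```python
from math import sqrt
from typing import Dict, Hashable, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def retest_correlation(first: Dict[Hashable, float],
                       second: Dict[Hashable, float]) -> Tuple[float, int]:
    """Correlate scores of the pupils tested on both occasions.

    Returns (r, number of matched pupils)."""
    matched = sorted(first.keys() & second.keys(), key=str)
    r = pearson_r([first[p] for p in matched], [second[p] for p in matched])
    return r, len(matched)


if __name__ == "__main__":
    # Hypothetical scores; pupil "d" missed the retest, "e" is new.
    time1 = {"a": 10.0, "b": 4.0, "c": 7.5, "d": 2.0}
    time2 = {"a": 8.0, "b": 5.0, "c": 9.0, "e": 6.0}
    r, n = retest_correlation(time1, time2)
    print(f"r = {r:.2f} over {n} matched pupils")
```

Matching on an identifier before correlating makes explicit that the reported coefficient rests on the 61 retested pupils and not on the original 113.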

The distributions of mean ratings of in- 
terest-in-complexity scores and TSC scores 
were not significantly different from those 
in Experiment I. 

Correlation coefficients again showed
that IQ scores correlated significantly
with school grades (except penmanship)
but not with TSC scores (r = .01). On the
other hand TSC scores correlated significantly
only with English composition, penmanship,
and music (p < .05). TSC scores
again correlated with BWAS scores (r =
.14, N = 428, p < .01). Table 6 summarizes
these correlation coefficients.
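
That a coefficient as small as .14 reaches the .01 level follows from the sample size. As a worked check (the conventional t transformation of a correlation coefficient, not necessarily the procedure the author used), taking r = .14 and N = 428 from the text:

```latex
t \;=\; \frac{r\sqrt{N-2}}{\sqrt{1-r^{2}}}
  \;=\; \frac{.14\,\sqrt{426}}{\sqrt{1-.0196}}
  \;\approx\; \frac{.14 \times 20.64}{.990}
  \;\approx\; 2.92, \qquad df = 426 .
```

With 426 degrees of freedom the two-tailed critical value at the .01 level is about 2.59, so the reported significance level is consistent with the reported r and N, even though the correlation accounts for only about 2% of the variance (r² ≈ .02).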




Unlike Penney and McCann’s (1964)
findings, there were no significant sex differences
in curiosity. However, this is in
line with Maw and Maw (1964), who reported
no consistent sex differences in the
selection of unbalanced and/or unfamiliar
symbols and figures.


Discussion 


The importance of specific curiosity as a
dimension in a student’s behavior need
hardly be argued. A high level of specific
curiosity indicates an interest in approaching
and exploring high levels of novelty,
complexity, incongruity, etc. Certainly,
such a characteristic mode of behavior
should lead to the development of an individual
who seeks to learn and to develop.
Barron (1963) and Golan (1962) show
that preference for asymmetry and complexity
in visual patterns is linked with
high creativity and flexibility. If one goal
of education is the development of creative
individuals, then the educational system
should inspire and reward curiosity.

However, results in this series of experiments
indicate the failure of the present
educational system to develop interest in
complexity. This is demonstrated in the
failure of students to improve scores on a
test of specific curiosity over 11 months
and in the failure to find a consistent correlation
between school grades and TSC
scores. Examination of the school grade-curiosity
correlations shows that no subject
scores are consistently correlated with
TSC scores although a few of them (composition,
penmanship, and music) did
correlate significantly in two of three studies.
It is interesting to speculate on the
reasons for these correlations since these
three subjects emphasize orderliness, symmetry,
and aesthetic qualities, but any
answers to this question would require
additional studies directly concerned with
an understanding of curriculum content in
the various subjects.


TABLE 6
CORRELATION COEFFICIENTS OF SPECIFIC
CURIOSITY AND IQ SCORES OF
GRADE 7, 8, AND 9 STUDENTS

                                      Specific
Subjects                              curiosity     N      IQ      N

IQ                                     —.01        395     …       …
English literature                       …          …      …       …
English (composition)                    …          …      …       …
Penmanship                               …          …     .08      …
History                                  …          …      …       …
Geography                               .00        429     …       …
Mathematics                            —.06         …      …       …
French                                   …          …      …       …
Art                                      …          …      …       …
Industrial arts or home economics        …          …      …       …
Music                                    …          …      …       …
Barron-Welsh Art Scale                  .14**      428    .02      …

*p < .05.
**p < .01.


On the other hand, intelligence seems to
be an important correlate of school
achievement accounting for a fair proportion
of variance. This concurs with the
commonly accepted notion that achievement
in the classroom requires intelligent
behavior but little or no interest in expanding
one’s knowledge into areas peripherally
or indirectly connected with curriculum.

Further work is presently in progress 
which will relate the concept of specific 
curiosity to that of anxiety and to other 
personality characteristics in adolescents 
(Day, 1966). A series of studies has also 
been undertaken with the goal of examining 
the methods of manipulating the level of 
specific curiosity (Day, 1967a; Day & 
Thomas, 1967; Sobol & Day, 1967). Results 
to date emphasize the importance of main- 
taining arousal level at an optimum level in 
order to achieve maximum curiosity. 


STUDENT PERSONALITY CORRELATES OF
TEACHER RATINGS
The 227 students taught by 3 instructors of a course in educational
psychology completed an instructor/course rating form and a personality
test. The ratings were factored and 9 factors were rotated by the
normalized varimax technique. Factor scores were computed for each
student. Correlations between the personality test and factor scores
were computed separately for each of the 3 instructor samples. Differences
between correlations were tested for significance among the
3 instructors. The results indicate that generally different personality
scores correlate with a given factor score, and in some instances the
same personality variable correlates with a given factor score in opposite
directions from one instructor to another. It was concluded that
the factors have a different psychological meaning for each instructor.
The results were discussed in terms of the known differences among
the instructors.


Of the many studies dealing with teacher
ratings, only a few have considered the personality
correlates of these evaluations
(Getzels & Jackson, 1963; McKeachie,
1963; Remmers, 1963). Further, of these
few studies most have been concerned with
correlations between ratings and the personality
characteristics of the teachers. The
results are consistent enough to indicate
that ratings are correlated with measured
aspects of an instructor’s personality obtained
by self report and peer nomination
techniques (Corcoran, 1961; Isaacson,
McKeachie, & Milholland, 1963; Veldman
& Peck, 1963). Evaluations of instructors,
then, appear to reflect something more
than a student’s personal reactions.

An equally important but somewhat
neglected aspect of teacher ratings is their
relationship to the personality characteristics
of those doing the rating (Rezler,
1965). That individuals with different
educational values describe different traits
as being generally important for teachers
is not surprising (Kerlinger, 1966); any
evaluative decision or choice must be
viewed in terms of a preexisting frame of
reference from which the value or weight
of a given dimension is derived. It is contended
that certain measured personality
characteristics of students reflect to some
extent these preexisting frames of reference.
These frames of reference are not
merely something brought to the classroom
by the student, but rather they are part of
the total context in which the instructor’s
behavior occurs. As part of the context,
even if only an implicit part of the background,
they contribute to the meaning of
an instructor’s behavior or teaching.

Where Rezler (1965) studied student
personality correlates of teacher ratings, 
ignoring instructor differences, the purpose 
of the present study is to show that stu- 
dent personality correlates of teacher rat- 
ings vary markedly from one instructor 
to another when students of similar aver- 
age personality characteristics rate these 
instructors with the “same” evaluation 
form. 


METHOD


Instructor Sample 


Three instructors of a course in introductory
educational psychology were rated on several
scales by students enrolled in this course during
the fall and spring semesters of 1965-1966. The
course is essentially a lecture course. All sections
use a common textbook, and one half of each
examination (covering the textbook) is common.
Each instructor is free to develop the lecture
aspect of the course in whatever way he deems
appropriate. Instructor A has chosen to emphasize
the classroom application of psychological
facts, and he also makes greater use of audio-visual
aids than Instructors B and C. Instructor B
emphasizes the psychological aspect of the subject
matter with some explicit attempt to develop the
educational implications of experimental results
in psychology and education. Instructor C emphasizes
a humanistic psychology focusing on topics
of relevance to education but with little explicit
development of classroom implications.

The three instructors may be ranked in terms
of an applied-theoretical course emphasis with A
being most applied and C most theoretical. Since
both B and C offer primarily a lecture course,
their style of teaching is more similar to each other
than to A, although in terms of the purpose of the
course offered, A and B are more similar. The current
interests of the three instructors are also
somewhat diverse. Instructor A, by training and
interest, is oriented toward guidance and counseling,
Instructor B is interested in experimental and
methodological procedures, and Instructor C is
interested in certain philosophical aspects of psychology.
To some extent these interests are
probably expressed in the way each teaches the
course.


Student Sample 


Both the evaluation form and a personality test
were completed by 227 students. This represents
approximately 70% of the total course enrollment.
There are 50 men and 177 women in the sample,
most being in their junior or senior year. Since
for each instructor the patterns of correlations
among ratings and between ratings and personality
test scores are similar for men and women, as well
as for both semesters, these data have been combined
for each of the three teachers. Instructor A,
who taught two sections of the course, was rated
by a total of 39 students, Instructor B, who taught
four sections of the course, was rated by a total
of 120 students, and Instructor C, who taught
three sections, was rated by a total of 68 students
who also completed the personality test. The average
scores of students taught by the three instructors
do not differ significantly on any of the 14
personality measures nor are there significant differences
among instructors on each of the nine
factor scores.


Measures


The evaluation form contains 21 items dealing
with teacher behavior and 12 focusing on aspects
of the course. These items were presented as seven-point
rating scales and were adapted from published
and unpublished forms, or were written by
the present investigators.

Thirty-two of the items² from the evaluation
form were factored by means of an incomplete
components analysis using ones in the diagonal of
the intercorrelation matrix. This analysis was performed
on the zero order correlations derived from
the total sample of 227 students. Nine factors were
selected for rotation by the normalized varimax
technique, and factor scores were computed for

²The item dealing with the adequacy of the
textbook was omitted because different texts were
used from one semester to the other.


TABLE 1

FACTOR STRUCTURE OF THE EVALUATION FORM FOR
227 STUDENTS (LOADINGS OF .40 OR GREATER)

Factor                                              Loading

Factor I: Confident, fluent delivery
  Poised and confident                                .74
  Speaks clearly and distinctly                       .71
  Speaks fluently and without hesitation              .61
  …                                                   .57
Factor II: Clarity of course …
  … of course …                                       .83
  Course in meaningful sequence                       .82
  Assignments … and reasonable                        .48
  Explanations clear                                  .44
Factor III: Open, sympathetic attitude toward students
  Willingness to help students                        .73
  Students feel free to express own opinions          .70
  Sympathetic attitude toward students                .70
  Respectful of views other than own                  .64
  Open to comments and questions                      .61
  Sense of proportion and humor                       .60
  Permissive and flexible                              …
  Free from annoying …                                 …
Factor IV: … stimulating teaching
  Stimulates …                                         …
  … material … interesting way                         …
  Relates subject to other fields                     .70
  … ability                                           .67
Factor V: Fairness of evaluation
  Content of exams appropriate                        .82
  Fairness of grades                                  .80
Factor VI: Suitable class material and value of …
  Contribution of course to general education         .79
  Value of course for teacher preparation              …
  No overlap with other courses taken                  …
Factor VII: …
  Frequency of papers …                                …
  Frequency of exams …                                 …
Factor VIII: … of subject
  Knowledge of subject matter                          …
  Knowledge of new developments in field              .68
Factor IX: …
  Uses appropriate techniques of instruction           …


each student on the nine factors. The items on the
evaluation form which loaded .40 or higher on
the nine factors are presented in Table 1.

In addition to the above listed items and the
resulting factor scores, semester grade (Gr) in
the course was another variable. Further, all students
were given the Omnibus Personality Inventory
(OPI), Form F (Center for the Study of
Higher Education, 1962). The OPI was derived
explicitly for use with college students. It yields
14 measures of relevance to an academic context.
Brief scale descriptions follow.

Thinking Introversion (TI). Liking for reflective
thought and scholarly activities.

Theoretical Orientation (TO). Interest in science
and scientific activities.

Estheticism (Es). Diverse interests in artistic
matters and activities.

Complexity (Co). Tolerance for ambiguities
and uncertainties, a fondness for novel situations
and ideas.

Autonomy (Au). Nonauthoritarianism and a
need for independence.

Religious Orientation (RO). Skeptical of orthodox
religious beliefs and practices.

Social Extroversion (SE). Strong interest in
people and being with them.

Impulse Expression (IE). Readiness to express 
impulses and to seek gratification in thought and 
action. 

Personal Integration (PI). Admits to few atti- 
tudes and behaviors that characterize socially 
alienated persons. 

Anxiety Level (AL). High scorers deny feelings 
or symptoms of anxiety, and do not admit to 
being nervous or worried. 

Altruism (Am). Strong concerns for the welfare
and feelings of people.

Practical Outlook (PO). Evaluates ideas and 
things in terms of immediate utility; values ma- 
terial possessions and concrete accomplishments. 

Masculinity-Femininity (MF). High scorers 
(masculine) deny interests in esthetic matters, 
admit to few adjustment problems, express an 
interest in scientific matters. 

Response Bias (RB). High scorers are respond- 
ing to this measure in a manner similar to a group 
of students who were explicitly asked to make a 
good impression on the test. 

A 47 by 47 variable (14 OPI, 32 ratings, grade) 
correlation matrix was calculated separately for 
the students of each instructor. Some of these cor- 
relations will be presented below. In addition, the 
nine factor scores were correlated with the OPI
scores and grades for the students of each instruc-
tor; these correlations form the major basis of the
present study.
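The within-instructor correlations described above can be illustrated with a short sketch. It is not the authors' computation: the record layout, the variable names (`Co`, `factor_iv`), and the scores are hypothetical and serve only to show what it means to correlate the same two variables separately for the students of each instructor.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_instructor(records, var_x, var_y):
    """Correlate var_x with var_y separately within each instructor's sample."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["instructor"]].append(rec)
    return {
        name: pearson_r([r[var_x] for r in rows], [r[var_y] for r in rows])
        for name, rows in groups.items()
    }


# Hypothetical records, invented for illustration (not data from the study).
records = [
    {"instructor": "A", "Co": 10, "factor_iv": 5.0},
    {"instructor": "A", "Co": 14, "factor_iv": 4.0},
    {"instructor": "A", "Co": 18, "factor_iv": 3.0},
    {"instructor": "C", "Co": 10, "factor_iv": 3.0},
    {"instructor": "C", "Co": 14, "factor_iv": 4.0},
    {"instructor": "C", "Co": 18, "factor_iv": 5.0},
]
by_instructor = correlations_by_instructor(records, "Co", "factor_iv")
```

In this toy case the same pair of variables correlates negatively in one sample and positively in the other, a pattern that pooling the two samples would conceal.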


RESULTS
Teacher Factors 


The data presented below show the per- 
sonality and grade correlations with the 


TABLE 2
CORRELATES OF FACTOR I, CONFIDENT, FLUENT DELIVERY, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)
Es —.19 | —.20° f 
PI ‘ae | “ibe | 38" | AB versus c 
AL -32 18 | —05 
ee ee ees 
RB tase | og) | 20g | A - 
Gr +06 :01 -33°* | B versus C 


Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference.

scores for the five factors pertaining to
teacher behavior. Not only are there no-
ticeable differences in the direction of the
correlations from one instructor to an-
other, but there are several significant dif-
ferences between correlations.

Although the primary purpose of the 
present study is focused on differences in
correlations from instructor to instructor, 
it is also instructive to note the types of 
student who are differentially appealed to 
by these instructors. Many of the results to 
be presented can be understood in the light 
of what was said previously about each in- 
structor. 

The significant personality correlates of 
Factor I, confident, fluent delivery, for the 
three instructors are presented in Table 2. 

As may be noted, there are significant 
differences in the personality test correla- 
tions among instructor samples on the 
measures of esthetic interests, masculine-
feminine interests, and semester grade. As
indicated by the significant correlates of
this factor for each instructor, a different
type of student rates each instructor high
on confident, fluent delivery of lectures.
Specifically, the significant correlates for 
Instructor A are Response Bias (.43), Per- 
sonal Integration (.35), Masculinity (.33), 
and Anxiety Level (.32). Thus, students 
rating Instructor A high on this factor 
may be characterized as having a positive 
self-regard and masculine interests. The 
significant correlates for Instructor B are 
Practical Outlook (.20), Estheticism (—.20), 
and Masculinity-Femininity (.18), thus 
characterizing students who rate Instructor
B high as being practical, nonesthetic, and
masculine in interests. On the basis of the
significant correlates of this factor for In-
structor C, students giving this instructor
a high rating may be characterized as es-
thetically oriented (Es .29) with a high
level of achievement in the course (Gr .33).

The personality measures related to
Factor III, open, sympathetic attitude 
toward students, for the three instructors 
are presented in Table 3. 

There are significant differences in cor-
relations among the three instructors on the
measures of Estheticism, Social Extrover-
sion, and Anxiety Level. The significant
correlates of this factor for Instructor A, 
Anxiety Level (.35), Personal Integration 
(.34), Response Bias (.34), and Altruism
(.32), indicate that the type of student who
perceives this instructor as open and sym-
pathetic is the student with a positive self-
regard and a concern for the welfare of
others. The practically oriented (PO .19) 
student tends to rate Instructor B high, and 
the esthetically oriented (Es .29) student 
tends to rate Instructor C high on this fac- 
tor. 

The significant personality correlates of 
Factor IV, interesting, stimulating teacher, 
for the three instructors are to be found in 
Table 4. 

Nine of these 10 variables show signifi-
cant differences in correlations among the 
three instructors. With respect to the type 
of student rating each instructor high on 
this factor, the correlates for Instructor A, 
Personal Integration (.44), Response Bias 
(48), Complexity (—.37), and Anxiety 
Level (.36), indicate that students with a 
positive self-regard and who like well- 
structured, unambiguous situations rate 

him high. The significant correlates for In-
structor B, Anxiety Level (.28), Religious
Orientation (-.23), Masculinity-Feminin-
ity (.18), and Response Bias (.18) suggest
the characterization of a student with a pos-
itive self-regard, with strong religious be-

TABLE 3
CORRELATES OF FACTOR III, OPEN, SYMPATHETIC, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)

ta —.07 -.2 -29° B versus C 
Pr +31 —.12 —.06 A versus B 

Al 34" -10 06 
re +35 12 —.08 A versus C 

fe ae | ca | 

RB sae | lit | 100 


Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference.


TABLE 4
CORRELATES OF FACTOR IV, INTERESTING, STIMULATING TEACHER, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)
Es 23 =.18: .87** | AB Cc 
a aes —.09 -23* | A versus C 
RO cto | Sizgee | ion |? tem’ 
PI 440 14 06 A versus C 
AL »36* 23%" | —.07 AB versus C 
PO +26 16 =< AB versus C 
MF 24 -18* 14 B versus C 
RB 438° -18* 03 A versus C 
Gr —.00 -2 -41** | AB versus C 
Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference. For example, with
respect to Es, the correlation based on the sample for Instructor
A and for Instructor B is each significantly different from the
correlation based on the sample for Instructor C.
* p < .05.
** p < .01.


liefs, and masculine interests as rating him 
high on Factor IV. The three correlates for 
Instructor C, Grades (.41), Estheticism 
(.37), and Complexity (.23), indicate that 
students perceiving Instructor C as an in-
teresting, stimulating teacher may be char-
acterized as being high achievers in the 
course, as having esthetic interests, and as 
liking ambiguous, unstructured situations 
as well as novel ideas. The fact that Com- 
plexity correlates significantly with this 
factor for both Instructors A and C but 
in opposite directions is worthy of note be- 
cause it points clearly to the very thesis of 
the present paper; namely, the psychologi- 
cal meaning of each factor varies across in-
structors.
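The comparison of the two Complexity correlations can be worked through numerically. The table notes state only that differences between Z transformed correlations must exceed twice the standard error of the difference; the sketch below assumes the usual Fisher transformation, z = atanh(r), and the usual standard error for two independent samples, sqrt(1/(N1 - 3) + 1/(N2 - 3)). The function name is illustrative.

```python
from math import atanh, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """Difference between two independent correlations on Fisher's z scale.

    Returns the z difference, its standard error, and their ratio.
    Assumes the standard large-sample formula for independent samples.
    """
    z_diff = atanh(r1) - atanh(r2)
    se_diff = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z_diff, se_diff, z_diff / se_diff


# Complexity with Factor IV, as reported in the text:
# Instructor A: r = -.37, N = 39; Instructor C: r = .23, N = 68.
z_diff, se_diff, ratio = fisher_z_difference(-0.37, 39, 0.23, 68)

# Criterion from the table notes: difference exceeds twice its standard error.
significant = abs(z_diff) > 2 * se_diff
```

Under these assumptions the difference is roughly three standard errors, which agrees with the "A versus C" entry for Complexity in Table 4.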

The personality measures related to Fac- 
tor VIII, interest in and knowledge of sub- 
ject matter, for the three instructors are 
presented in Table 5.

As indicated, seven of the personality
measures have significantly different cor- 
relations with this factor across the three 
instructors. The measures of Personal In- 
tegration (.37), Response Bias (.37), Com-
plexity (-.37), and Anxiety Level (.35),
indicate that the type of student rating 
Instructor A high on this factor may be 
characterized as having a positive self-

TABLE 5
CORRELATES OF FACTOR VIII, INTEREST IN AND KNOWLEDGE OF SUBJECT MATTER, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)
TI 19 —. 19° 15 B versus C 
Es —.21 —15 19 B versus C 
Co —.37* —17 14 AB versus C 
Au 24 —-16 15 B versus C 
RO —.00 —.22* 03 
SE 29 -.1 02 A versus B 
PL .37* 10 09 
AL +35 -18* | —.04 A versus C 
PO 28 «22° —.16 AB versus C 
RB 37° +09 10 


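Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference. For example, with
respect to Co, the correlation based on the sample for Instructor
A and for Instructor B is each significantly different from the
correlation based on the sample for Instructor C.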

regard, and as liking structured, unambig- 
uous situations. The correlates for Instruc- 
tor B, Practical Outlook (.22), Religious 
Orientation (—.22), Thinking Introversion 
(—.19), and Anxiety Level (.18), indicate 
that students rating this instructor high 
tend to be practical, religious, nontheoreti- 
cal, and nonanxious. None of the person- 
ality correlates for Instructor C are signifi- 
cantly different from zero. 

The significant personality correlates of 
the final teacher dimension, Factor IX,
preparedness of lectures, for the three in- 
structors are to be found in Table 6. 

Five personality measures show signifi- 
cantly different correlations with this fac- 
tor among the three instructors. This is the 
first factor encountered in which the meas- 
ures of personal, social adjustment (posi- 
tive self-regard) do not correlate for In- 
structor A. Rather, the type of student 
rating him high has masculine interests (MF
.33), likes well-structured, unambiguous
situations (Co -.33), and does not have
esthetic interests (Es -.32). The one per-
sonality measure correlated with this fac- 
tor for Instructor B indicates that the prac- 
tically oriented student (PO .18) is 
likely to perceive him as having well-pre- 
pared lectures. The personality correlates 
for Instructor C allow a familiar character-
ization: a high achiever in the course (Gr .36),
with esthetic (Es .28) and feminine (MF -.3)
interests.


Course Factors 


The personality measures related to the
four factors describing aspects of the course
will now be presented. The student person-
ality correlates of Factor II, clarity of
course, for the three instructors are shown
in Table 7.

There are more significantly different
personality correlates (10) of this factor
across the three instructors than for any
other course or teacher dimension. Further,
for Instructor A, this is the factor with the
greatest number of significant correlates.
The six personality correlates for Instructor
A are Personal Integration (.50), Anx-
iety Level (.48), Response Bias (.45), Im-
pulse Expression (-.39), Social Extrover-
sion (.37), and Complexity (-.32).
Hence, the characterization of the student
rating his course high on this factor is one
having a positive self-regard, as being non-
impulsive, socially extroverted, and pre-
ferring well-structured, unambiguous situa-
tions. The three correlates of this factor
for Instructor B indicate that students rat-
ing his course high in clarity are religiously
oriented (RO -.24), nonanxious (AL .28),


TABLE 6
CORRELATES OF FACTOR IX, PREPAREDNESS OF LECTURES, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)

AB versus C 
A versus C, 
AB vers' C 


—.09 ‘00 4B versus C 


Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference. For example, with
respect to Es, the correlation based on the sample for Instructor
A and for Instructor B is each significantly different from the
correlation based on the sample for Instructor C.
* p < .05.
** p < .01.

and do not admit to personal and psycho-
logical shortcomings (RB .21). With re-
spect to Instructor C, high achievers in the
course rate it high in clarity (Gr .36).
Factor V, fairness of evaluation, reflects 
one of the structural aspects of the course 
in that all three instructors used essentially 
the same procedures (examinations and a 
paper) and criteria for evaluating the 
students. As might be expected, the per- 
sonality correlates of this factor are mini- 
mal. In fact, only two variables correlate 
significantly at the .05 level with this fac- 
tor. These are Anxiety Level for Instruc-
tor A and Grades for Instructor C. The
Anxiety Level correlation based on the 
sample for Instructor A (.86) is signifi- 
cantly different from the correlation based 
on the sample for Instructor C (-.08). The
correlate for Instructor A suggests that
the less anxious a student the higher he
will rate the fairness of evaluation. For In-
structor C, the better the course grade, the 
more favorable the rating on this factor. 
There are no significant correlates for In- 
structor B. 

The significant personality correlates of 
Factor VI, suitable class material and 
value of course, for the three instructors 
are presented in Table 8. 

Eight variables showed significant dif- 


TABLE 7
CORRELATES OF FACTOR II, CLARITY OF COURSE, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Correlates    Instructor A    Instructor B    Instructor C    Significantly different correlations (a)
Co —.3a* | —.10 12 | A versus C 
Au —.16 —.10 18 | B versus C 
RO —.08 —.24%* | —.09 
8E 87% —.01 —.05 A versus BC 
m —.39% | —.07 103 | A versus C 
50% 15 104 «| A versus BC 
AL 488° :23%° | —113 | AB versus CO 
PO +14 112 «| —.21 | B versus C 
MF 22 115. | —119 | AB versus C 
BB <45%* .2i* 104 | A versus C 
r. 14 06 :36** | B versus C 


Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
(a) Differences between Z transformed correlations exceed
twice the standard error of the difference.

TABLE 8
CORRELATES OF FACTOR VI, SUITABLE MATERIAL AND COURSE VALUE, SIGNIFICANTLY DIFFERENT FROM ZERO WITHIN AND/OR SIGNIFICANTLY DIFFERENT AMONG INSTRUCTOR SAMPLES

Note. For Instructor A, N = 39; for Instructor B, N = 120;
for Instructor C, N = 68.
* p < .05.
** p < .01.


ferences in correlations among the three 
instructors. The measures having a signifi- 
cant relationship for Instructor A indicate 
that students rating his course favorably 
do not like ambiguous, uncertain situations 
(Co —.40), they do not admit to feelings 
of anxiety (AL .39), they are socially out- 
going (SE .38), and they are nonimpul- 
sive (IE —.32). The type of student per- 
ceiving the course given by Instructor B 
as high is practical in outlook (PO .24), 
religious (RO -.24), authoritarian (Au
-.19), uninterested in reflective, theoreti-
cal thinking (TI -.18), and not esthetically
inclined (Es -.18). Finally, for Instructor
C, high achievers in the course tend to 
perceive the materials as suitable and the 
course of value (Gr .41). 

With respect to the final course dimen- 
sion, Factor VII, frequency of evaluation, 
it is not surprising that none of the per- 
sonality variables correlate significantly 
with this factor. This dimension reflects a 
structural feature of the course which is 
constant across all three instructors—the 
same number of examinations and papers 
are required. It is, however, interesting to 
note that the correlations between this fac- 
tor and Autonomy (—.18, .18) as well as 
Practical Outlook (.16, -.17) are signifi-
cantly different for Instructors B and C.
These differences in correlations probably
reflect the fact that these structural con- 
stants of the course are viewed in a slightly 
different way across instructors because 
these aspects are experienced as parts of 
different wholes. 


All-Around Teaching Ability 


Before discussing the above results, it 
might be instructive to report, for each in- 
structor, some of the correlations with an 
item adapted from Isaacson et al. (1963), 
and which served as a primary variable in 
their study. The item is: How would you 
rate your instructor in general (all-around) 
teaching ability? The response alternatives 
are: (a) a very poor and inadequate in- 
structor, (b) a poor and inadequate in- 
structor, (c) an adequate but not stimulat- 
ing instructor, (d) a good instructor, (e)
a very good instructor, (f) an outstanding 
and stimulating instructor, and (g) a very 
outstanding and stimulating instructor. 

This item had the highest loading (.76) 
on the general factor of items prior to ro- 
tating to a varimax solution. The signifi- 
cant differences in correlations with this 


TABLE 9 


item, presented in Table 9, suggest that
this rating has a somewhat different psy-
chological meaning from one instructor to
another. Further, with regard to this rating
of all-around teaching ability, many of the
differences in correlations not reaching a
level of statistical significance appear di-
vergent enough to suggest that this item
would load differently on a given factor if
factor analyses had been performed sepa-
rately for each of the three instructor sam-
ples.


DISCUSSION AND CONCLUSIONS


By ignoring the particular context in 
which teacher and course ratings were ob- 
tained (i.e., using the total sample), it was 
possible to identify nine meaningful fac- 
tors representing aspects of teacher be- 
havior and course structure. Each factor is 
viewed as a positive component of an in-
structional situation (i.e., high factor
scores are viewed as favorable instructor
and course characteristics). It would
seem, then, that it is possible to identify
some of the general characteristics of a
good (or bad) instructor and course—at 


CORRELATES OF THE RATING “ALL-AROUND TEACHING ABILITY” SHOWING SIGNIFICANT DIFFERENCES AMONG THE THREE INSTRUCTOR SAMPLES


Correlates 


Teacher ratings 
Knowledge of new developments in field 
Stimulates intellectual curiosity 
Sympathetic attitude toward students 
Sense of proportion and humor 
Course ratings 
Course well organized in meaningful sequence 
Assignments clear and reasonable 
Materials suited to class level 
ceed measures 


Es 
Co 
RO 
Gr 


Note.—For Instructor A, N = 39; for Instructor B, N = 120; for Instructor C, N = 68. 


Instructor A    Instructor B    Instructor C    Significantly different correlations (a)
21 .53** 132* 
.53"* 15%* .61** | A versus B 
31 -83** 03 B versus C 
«35* .54"* -26* B versus 
-66** 53%* -28* A versus C 
49** torre -10 A versus 
-16 49** thd A versus B 
— 26 -01 -23* A versus C 
=17 —.09 -33** | AB versus 
—.27 —.04 18 A versus 
-ll —.30** | —.11 A versus B 
+20 -22* .57** | AB versus 


(a) Differences between Z transformed correlations exceed twice the standard error of the difference.
For example, with respect to Es, the correlation based on the sample for Instructor A and for Instructor
B is each significantly different from the correlation based on the sample for Instructor C.

* p < .05.
** p < .01.

least as far as a course in educational psy-
chology is concerned. A study of the per-
sonality correlates of the nine factors 
across the three instructors, however, does 
not allow this unqualified conclusion. The 
data of the present study consistently 
show that the psychological meaning of 
these factors varies from instructor to in-
structor. That is, there are significant dif-
ferences in correlations between the per- 
sonality and factor scores from instructor 
to instructor; further, different personality 
characteristics are correlated with a given 
factor for different instructors. More spe- 
cifically, the type of student who tends to 
rate one instructor high on a given factor 
may be the type of student who tends to 
rate another instructor low on the “same” 
dimension. These results suggest that it is 
only by viewing these “abstract” factors in 
the light of the context in which they were 
obtained that their psychological meaning 
can be grasped. This important point can 
perhaps be more forcefully expressed by 
saying the meaning of the nine factors, 
considered from the perspective of the total 
sample, is analogous to the definitions of 
words found in a dictionary; the sense of 
these factors, as the expressive meaning of 
words, is determined not by the dictionary 
entries but by the contexts in which they 
occur. The instructor and the student are 
two aspects of the same situation; as such, 
the teacher’s behaviors and course organi- 
zation cannot be separated from the stu- 
dent’s reactions without a considerable loss 
of understanding. 

The most frequent personality corre- 
lates across the nine factors for Instructor 
A, who has a background and an interest 
in counseling and guidance, are Anxiety
Level, Personal Integration, Response Bias,
and Complexity (consistent negative cor-
relation). A student scoring high on the
first three measures and low on the fourth
may be characterized as being well ad-
justed, as having a positive self-regard,
and as disliking ambiguities and uncer-
tainties as well as novel situations and
ideas. The most frequent personality corre-
lates for Instructor B, who has an interest
in experimental and methodological pro-
cedures and issues, are Practical Outlook
and Religious Orientation (consistently 
negative). Thus, a student who tends to 
evaluate ideas and things in terms of 
immediate utility and who has strong re- 
ligious commitments will tend to evaluate 
Instructor B and his course favorably. 
Finally, the most frequent correlates for 
Instructor C, who has an interest in cer- 
tain philosophical issues and aspects of 
psychology, are grades in the course and 
esthetic interests. The student whose rat- 
ings of Instructor C will tend to be favor- 
able is a high achiever in the course with 
diverse interests in artistic matters and ac- 
tivities. 

It is only with respect to Instructor C 
that the relationship between ratings and 
course grades supports the finding by Wea- 
ver (1960) that student ratings of teachers 
are related to the grades the student ex- 
pected to receive in the course. Although 
actual course grade was used in the present 
study, all students knew their grades on 
two of the three examinations and on the 
paper at the time the evaluation form was 
completed. To speak of rater bias solely 
on the basis of a correlation between rat- 
ings and grades, as does Weaver, is unwar- 
ranted. 

So far the emphasis has been on differ- 
ences in factor correlates for the three in- 
structors, but there are interesting similar- 
ities. If the 15 factor correlates for each 
instructor are ranked from high positive 
through zero to high negative for each of 
the nine dimensions, Instructors A and B 
are positively and significantly correlated 
with each other on all nine factors. In- 
structor C is negatively correlated with the 
other two on all factors, although signifi- 
cantly so on only four of the nine. This 
analysis throws into relief what the reader 
has probably grasped from the results al- 
ready presented. 
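The ranking analysis just described can be sketched as a rank-order correlation between two instructors' profiles of correlates. The text does not name the coefficient used; the sketch assumes Spearman's rank correlation, computed as the Pearson correlation of the ranks. The three short profiles are hypothetical stand-ins for the 15 correlates per instructor and are not values from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from highest (rank 1) to lowest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the tie
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical correlate profiles on one factor (invented for illustration).
profile_a = [0.40, 0.30, -0.30, 0.20, 0.00]
profile_b = [0.30, 0.20, -0.20, 0.10, 0.05]
profile_c = [-0.10, 0.00, 0.30, 0.05, 0.40]

rho_ab = spearman_rho(profile_a, profile_b)
rho_ac = spearman_rho(profile_a, profile_c)
```

With these invented profiles, A and B order the correlates identically while C nearly reverses the order, which mirrors the pattern of agreement and disagreement reported for the three instructors.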

The results of the present study do not 
mean that the ratings are distorted by the 
personality characteristics of the raters 
and thus lack objectivity. This conclu- 
sion follows only from a view which would 
claim the possibility of rating from no par- 
ticular point of view in no particular con-
text. Human perception requires a point of
view and a context, and, in fact, they are 
part of the meaning of what is perceived. 
The data presented are consistent with this 
thesis. As Merleau-Ponty puts it, “The 
subject’s intentions are immediately re- 
flected in the perceptual field, polarizing it, 
or placing their seal upon it, or setting up 
in it, effortlessly, a wave of significance 
[1962, p. 131].” 

By keeping the subject in perception, it 
is not likely that we will view him as an 
impartial recorder of events, and it is
likely that we will understand the perceiver 
by noting his points of view and the cor- 
related meanings of the situations in which 
we find him.

Bright older children accelerated in lower elementary grades were
compared with nonaccelerants toward the end of 9th grade. Ss were 
22 children accelerated from Grade 2-4, 14 children accelerated from 
Grade 3-5, and 4 nonaccelerant groups: 27 bright younger children, 22 
bright older children, 21 average-ability younger children, and 23 
average-ability older children. On 6 tests of educational achievement, 
9 tests of divergent thinking, and 2 psychomotor tests, both accelerant 
groups were equal to or higher than the other 4 groups. The nonaccel- 
erated older bright children were higher than at least 1 of the accel- 
erated groups on 4 tests of educational achievement, 2 tests of 
divergent thinking, and 2 psychomotor tests. The accelerated groups 
participated in school activities, advanced classes, and varsity ath- 
letics, to about the same extent as the older bright nonaccelerants. 


The ideal of public education is to en- 
courage each child to learn as well and as 
fast as he can, commensurate with optimal 
personality development. Lack of complete 
success in achieving this ideal is shown in 
the widespread attention given to aca- 
demically talented children in the late 
1950s and early 1960s and, more recently, 
to educationally disadvantaged children. 
This attention to various groups of chil- 
dren clearly indicates that instruction has 
not yet become sufficiently individualized 
to provide well for each child. Therefore, 
special provisions must be made on a
widespread basis for groups of children. 
One of many possible provisions for the 
gifted that offers effective utilization of the 
resources of the school with little change in 
grouping procedures, organization, and the 
like, is acceleration whereby the student 
completes 12 grades of school in less than 
12 calendar years. 

In 1960, a random half of all the 
bright, older pupils who met certain speci- 
fied criteria and who had just completed
the second grade in Racine, Wisconsin,
were accelerated to the fourth grade after
a 5-week summer session. During this ses- 
sion, instruction related to the usual 
third-grade curriculum was given. Toward 
the end of the fourth grade, effects of this 
acceleration appeared to be entirely favor- 
able (Klausmeier & Ripple, 1962). During
the summer of 1961 the random half of the
bright, older pupils that had not been ac- 
celerated were given the opportunity to ac- 
celerate from Grade 3 to Grade 5 after 
participating in a similar 5-week session, 
but they were not studied further. How- 
ever, the first group accelerated from sec- 
ond to fourth grade was studied intensively 
again and was doing well toward the end 
of the fifth grade (Klausmeier, 1963). The 
present study considers the longer-range 
effects of the experience upon both groups 
of accelerants toward the end of the ninth 
grade. 


METHOD


Subjects 


The 129 Ss (54 boys and 75 girls) were distrib- 
uted as shown in Table 1. The abbreviations used 
in Table 1 and below refer to groups as follows:

Acc 2-4: Accelerated from second to fourth grade
in 1960, currently ninth-graders.
Acc 3-5: Accelerated from third to fifth grade
in 1961, currently ninth-graders.
9SY: Nonaccelerated pupils of superior ability
below the median age of normally
progressing ninth-graders.
9SO: Nonaccelerated pupils of superior ability
above the median age of normally
progressing ninth-graders.
9AY: Nonaccelerated pupils of average ability
below the median age of ninth-graders.
9AO: Nonaccelerated pupils of average ability
above the median age of ninth-graders.

The students in the Acc 2-4 and Acc 3-5 groups
were identified in the spring of 1960 and the others

TABLE 1
CHARACTERISTICS AND DISTRIBUTION OF SUBJECTS

                                                Group
Characteristic                      Acc 2-4  Acc 3-5  9SY     9SO     9AY     9AO
Mean IQ, September 1960             123.77   121.58   123.77  124.62  103.77  100.04
Mean deviation IQ, May 1966         130.9    127.6    125.0   129.9   110.1   111.4
Average age on September 1, 1965    13-5     13-5     13-11   14-3    13-11   14-4
Male subjects                       9        6        11      9       8       11
Female subjects                     13       8        16      13      13      12
Total                               22       14       27      22      21      23


in the fall of 1960. At that time the Acc 2-4 and 
Acc 3-5 groups each had 16 girls and 10 boys. The
same numbers of boys and girls comprised the 
other groups for statistical analyses although two 
alternate girls and boys were also identified in 
each group. In the present study the alternates 
were included in order to alleviate somewhat the 
natural exodus that had occurred during the six 
years since the study was started. As may be 
noted in Table 1, the smallest group is Acc 3-5 
(N = 14). This group is smaller than Acc 2-4 
partly because fewer were accelerated from Grades 
3 to 5 than from Grades 2 to 4. 

Also presented in Table 1 are Kuhlmann-Ander- 
son IQ scores as of September 1960 and the mean 
IQ scores in deviation form as obtained from the 
Kuhlmann-Anderson Test, Seventh Edition, Book- 
let H, Personnel Press, Inc., in May 1966. The dif- 
ferences among the first four groups were not 
significant at the .05 level in 1960 nor in 1966.
It is interesting to observe that the mean IQ of 
each group increased. One cannot determine 
whether the increase is solely a test artifact or 
whether these groups actually increased in IQ 
above the national standardization sample. 

The mean age of each of the six groups when 
they began ninth grade is included in Table 1. 
Both Acc 2-4 and Acc 3-5 will be graduated from
high school in June with a mean age of 17-2. The
9SY and 9AY groups will graduate at a mean age 
of 17-8, while the 9SO and 9AO groups will finish 
with a mean age of 18-0 and 18-1, respectively. 
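The projected graduation ages follow from simple arithmetic on the years-months notation used in Table 1. The sketch below assumes an interval of 45 months between September 1, 1965 and graduation. This interval is an inference (the text says only "in June"), chosen because it reproduces every projected age reported above. The helper names are illustrative.

```python
def parse_age(text):
    """Convert an age written as 'years-months' (e.g. '13-5') to months."""
    years, months = text.split("-")
    return int(years) * 12 + int(months)


def format_age(total_months):
    """Convert a number of months back to 'years-months' notation."""
    return f"{total_months // 12}-{total_months % 12}"


# Assumption: 3 years 9 months from September 1, 1965 to a June graduation.
MONTHS_TO_GRADUATION = 45

# Mean ages on September 1, 1965, from Table 1.
start_ages = {
    "Acc 2-4": "13-5",
    "Acc 3-5": "13-5",
    "9SY": "13-11",
    "9SO": "14-3",
    "9AY": "13-11",
    "9AO": "14-4",
}

graduation_ages = {
    group: format_age(parse_age(age) + MONTHS_TO_GRADUATION)
    for group, age in start_ages.items()
}
```

Under this assumption the accelerated groups finish at 17-2, the younger nonaccelerated groups at 17-8, and the older groups at 18-0 and 18-1, matching the figures in the text.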


Instruments Used and Treatment of Data 


Five types of data were assembled on the 
pupils near the end of Grade 9. A brief description 
of each instrument follows. 

Educational achievement. The Tests of Aca- 
demic Progress, Grade 10, Form 2, Houghton Mif- 
flin Company, were administered to all subjects; 
Grade 10 of the test was given to provide an ade- 
quate ceiling for the ablest students. This test 
yields scores in six areas: social studies, composi- 
tion, science, reading, mathematics, and literature. 
Raw scores were converted to standard T scores
(M = 50, SD = 10) by means of the test manual; 
Grade 10 students were used as the norm group 
in conversion. Thus both accelerant groups, now 
in the ninth grade, are reported as having standard
score means of 52 on the reading subtest since
this is based on tenth-grade norms. 
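The standard-score scale can be made concrete with a short sketch. The study converted raw scores by means of the test manual, whose conversion tables need not be linear; the linear formula below, T = 50 + 10(raw - M)/SD, is the textbook definition of a T score and is offered only as an approximation. The norm-group mean and standard deviation are hypothetical.

```python
def t_score(raw, norm_mean, norm_sd):
    """Linear T score: mean 50 and SD 10 relative to the norm group."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Hypothetical Grade 10 norms for one subtest (not taken from the manual).
NORM_MEAN = 40.0
NORM_SD = 8.0

at_norm_mean = t_score(40.0, NORM_MEAN, NORM_SD)
above_norm_mean = t_score(41.6, NORM_MEAN, NORM_SD)
```

On this scale a group mean of 52 lies one fifth of a standard deviation above the Grade 10 norm group, which is how the reading result for the ninth-grade accelerants should be read.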

Ingenuity in problem solving. Form A, In- 
genuity, of the Flanagan Aptitude Classification 
Tests (FACT), Science Research Associates, Inc., 
was utilized as a measure of ingenious problem 
solving. Although this test has a large verbal com- 
ponent and a rather restrictive format for gauging 
“ingeniousness,” it nevertheless allows objective 
scoring. Each item right is scored 1; a maximum
score is 25. High school seniors were used to norm 
the test, with raw scores of 14, 18, 21, and 24, 
falling at the 50th, 75th, 90th, and 99th percentiles, 
respectively. 

Creative thinking abilities. Four instruments 
yielding eight scores were used: Alternate Uses, 
Form A, Expressional Fluency, Form A, Conse- 
quences, and Plot Titles, 0-1, Sheridan Supply 
Company. Each of the four tests was scored for 
fluency (number of relevant responses). Expres-
sional Fluency and Alternate Uses were also scored
for flexibility (i.e., number of relevant cate-
gories of response), while Plot Titles and Conse-
quences were evaluated for cleverness of response.
These tests were originally developed by Guilford 
and his associates and, although some of them are 
only recommended for experimental use, some 
available reliabilities are reported in accompany- 
ing manuals. Depending on the difficulty of the
judgments to be made, a training session was held 
and then either two or three scorers, working inde- 
pendently, scored the tests. The available scores 
on each pupil were then averaged across scorers. 
The average interjudge reliability in determining 
each of the eight scores is reported in Table 2. 
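
The scoring procedure described above (independent scorers, scores averaged across scorers, an average interjudge reliability) can be sketched as follows. The article does not state which reliability coefficient was used; the Pearson correlation averaged over all pairs of scorers is an assumption, and the scores shown are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_across_scorers(scores_by_scorer):
    """One averaged score per pupil; each inner list is one scorer's ratings."""
    return [sum(pupil) / len(pupil) for pupil in zip(*scores_by_scorer)]


def mean_interjudge_r(scores_by_scorer):
    """Mean correlation over every pair of scorers (assumed index of reliability)."""
    pairs = list(combinations(scores_by_scorer, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical fluency scores for five pupils from two scorers.
    scorer_a = [10, 14, 9, 20, 16]
    scorer_b = [11, 13, 9, 19, 17]
    print(average_across_scorers([scorer_a, scorer_b]))
    print(round(mean_interjudge_r([scorer_a, scorer_b]), 2))
```

With two scorers the mean reduces to a single correlation; with three it is the mean of three pairwise correlations.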

Psychomotor abilities. Three tests of psycho-
motor abilities were devised and administered by
Grace Piskula, Consultant in Physical Education
for the Unified School District, Racine, Wisconsin:
Zig-Zag Run, to measure agility and large muscle
coordination (the fewer the seconds required to
complete the run, the higher the ability); Wall
Pass, to determine eye-hand coordination and
speed of reaction (the more hits of a wall with a
ball in a 15-second interval, the better the coordi-
nation and reaction); and Standing Broad Jump,
to judge leg strength and ability to coordinate
body parts (scores reported in inches jumped).

TABLE 2
AVERAGE INTERJUDGE RELIABILITIES MAINTAINED
BY SCORERS OF TESTS OF CREATIVE
THINKING ABILITIES

Test                    Score                   Average interjudge reliability
Expressional Fluency    Fluency                  .94
                        Flexibility              .95
Alternate Uses          Fluency (ideational)     .97
                        Flexibility              .95
Plot Titles             Fluency (ideational)    1.00
                        Cleverness               .62
Consequences            Fluency (ideational)     .98
                        Cleverness               .85

N = 2 for each test except Consequences; for Consequences, N = 3.

Participation in school activities and special
programs. Each student involved in the follow-up
was given a questionnaire regarding his activities,
both in and out of school. The responses on these
were tabulated and the percentage of the group
responding yes to each item was computed. The
areas of interest to investigators will become ap-
parent in the results section, and include matters
such as enrollment in “condensed” courses, honor
roll lists, and participation in nonclass activities.

A 6 × 2 analysis of variance (groups by sex) was
run on each of the measures except those under
school participation. The latter are reported and
discussed as percentages. Where the difference
among the six groups was significant at the .05
level or beyond, a Newman-Keuls test was run
to ascertain which groups were significantly higher
or lower than the two accelerant groups. Differ-
ences between sets of the nonaccelerant groups
(9SY, 9SO, 9AY, and 9AO) are not presented in
the interest of brevity.

TABLE 3
MEANS, STANDARD DEVIATIONS, AND SIGNIFICANCE OF F RATIOS FOR 18 MEASURES

[Table body not recoverable from the source. Columns: M and SD for the Acc 2-4, Acc 3-5, 9SY, 9SO, 9AY, and 9AO groups, and significance of the F ratios for groups and for sex. Rows: six tests of educational achievement; Flanagan Ingenuity Test; fluency and flexibility on Expressional Fluency and Alternate Uses; fluency and cleverness on Plot Titles and Consequences; three psychomotor tests.]
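
The 6 × 2 (groups by sex) analysis of variance named in the procedure can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' computation: it requires equal numbers of pupils in every cell, which the study's groups may not have had, and it stops at the F ratios without the Newman-Keuls follow-up. The function name and the data are hypothetical.

```python
def two_way_anova(cells):
    """F ratios for a balanced two-way design.

    cells[i][j] holds the scores for group i and sex j; every cell
    must contain the same number of scores (at least two).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if a < 2 or b < 2 or n < 2:
        raise ValueError("need at least 2 groups, 2 sexes, and 2 scores per cell")
    if any(len(row) != b or any(len(c) != n for c in row) for row in cells):
        raise ValueError("equal cell sizes are required")

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for c in row for x in c])
    cell_m = [[mean(c) for c in row] for row in cells]
    group_m = [mean([x for c in row for x in c]) for row in cells]
    sex_m = [mean([x for row in cells for x in row[j]]) for j in range(b)]

    # Sums of squares for the two main effects, the interaction, and error.
    ss_group = b * n * sum((m - grand) ** 2 for m in group_m)
    ss_sex = a * n * sum((m - grand) ** 2 for m in sex_m)
    ss_inter = n * sum(
        (cell_m[i][j] - group_m[i] - sex_m[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_within = sum(
        (x - cell_m[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    if ss_within == 0:
        raise ValueError("no within-cell variation; F ratios are undefined")

    ms_within = ss_within / (a * b * (n - 1))
    return {
        "F_groups": ss_group / (a - 1) / ms_within,
        "F_sex": ss_sex / (b - 1) / ms_within,
        "F_interaction": ss_inter / ((a - 1) * (b - 1)) / ms_within,
    }


if __name__ == "__main__":
    # Hypothetical scores: 2 groups x 2 sexes x 2 pupils per cell.
    demo = [[[1, 3], [1, 3]], [[5, 7], [5, 7]]]
    print(two_way_anova(demo))
```

Each F ratio would then be referred to the F distribution with the corresponding degrees of freedom to obtain the significance levels reported in Table 3.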


RESULTS 


The means and standard deviations for 
total males and females in each group are 
presented in Table 3, as well as the sig- 
nificance level of the F ratios for groups 
and sex, with an indication of whether 
males or females were significantly higher. 
The mean scores on the tests of educa- 
tional achievement are of interest directly 
because the test battery and the related 
norms were for tenth graders. A score of 
50 is equivalent to the median score at- 
tained by the tenth graders who comprised 
the national standardization sample. The 
mean scores for both Acc groups and the 
9SO group were above the tenth-grade 
median. A mean score of 60 is roughly 
equivalent to a percentile score of 85. 
Thus, the mean score of 62.09 made by the
Acc 2-4 group in mathematics indicates
very high achievement.
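
The statement that a standard score of 60 corresponds roughly to the 85th percentile can be checked with a short sketch. It assumes that the norm group's scores are normally distributed with M = 50 and SD = 10; the percentile tables in the test manual may differ slightly.

```python
from statistics import NormalDist


def t_score_to_percentile(t, mean=50.0, sd=10.0):
    """Percentile rank of a T score under an assumed normal norm distribution."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(t)


if __name__ == "__main__":
    for t in (50, 60):
        print(t, round(t_score_to_percentile(t), 1))
```

Under this assumption a score of 60 lies one standard deviation above the mean, at about the 84th percentile, which agrees with the rounded figure given in the text.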

The differences among the means of the
six groups were statistically significant on
14 of the 18 measures. The four nonsignifi-
cant differences occurred on the creative
thinking battery (ideational fluency scores
on Alternate Uses, Plot Titles, and Conse-
quences) and on the Zig-Zag Run. Gener-
ally, the 9AY and 9AO groups had the low-
est mean scores on the 14 significant
measures while 9SO had the highest. Es-
pecially to be noted is the fact that on 14
of the 15 tests in the cognitive domain, the
9SY group did not differ significantly
from either Acc group, both of which had
1 less year of schooling and were on the
average 6 months younger than 9SY. Also,
the 9SO group was significantly higher
than both Acc groups on only two meas-
ures and higher than one Acc group but not
the other on four measures. Part of the su-
periority of the 9SO group may be related
to a somewhat different pattern of educa-
tional experiences, to be considered more
fully later. The comparison of the two Acc
groups is also of interest because they
had been accelerated at different times in
their school careers. No difference be-
tween these two groups was significant.

The difference between the sexes, inde- 
pendent of the groups, was significant on 
seven tests. On the educational achieve- 
ment measures, males were significantly 
higher in science and mathematics, while 
the situation was reversed for composition. 
Girls were significantly higher than boys 
on the flexibility score of the Expressional 
Fluency test. Boys performed significantly 
better than girls on all the psychomotor 
tests (girls took significantly more seconds 
to run a specified distance and thus per-
formed less well than boys).

On three measures the Group × Sex in-
teraction was significant: TAP Reading,
TAP Mathematics, and the Standing
Broad Jump (all p < .05). The girls in
the Acc 2-4, 9SY, and 9AO groups scored
considerably higher in reading than did
boys; however, the boys were somewhat
higher than the girls in the other groups
(Acc 3-5, 9AY, and 9SO). No explanation for
this interaction can be offered either in
terms of the composition of the groups ini-
tially or their subsequent education. Both
ability levels, superior and average, and
both programs, accelerated and normally
progressing, are equally involved. The
Group × Sex interaction in TAP Mathe-
matics was also significant. Here boys in
all groups, except 9SY, had higher mean
scores than girls; however, the difference
between the means of boys and girls varied
markedly among the five groups. The sig-
nificant Sex × Group interaction for the
broad jump is related to the unequal dif-
ferences between the mean scores of boys
and girls in the various groups inasmuch as
boys were higher than girls in all groups.
The large differences between boys and
girls occurred in the 9SO and 9AO groups;
the smaller differences occurred in the other
four groups composed of younger children.
This difference is explainable in terms of
physical development, namely, on meas-
ures of strength the difference between
boys and girls increases with age. (The dif-
ference in running speed also increases
with age but was not sufficiently large in
this study to produce a significant interac-
tion.)
Contained in Table 4 is a summary of
the extent of participation by the various
groups in school activities and special pro-
grams. The compressed courses were formed
by presenting the normal subject matter
content for 2 years in a single year. Thus,
2 years of social studies are combined in
Grade 6, 2 years of mathematics in Grade
7, 2 years of science in Grade 8, and 2 years
of English in Grade 9. A student completing
all four courses may be taking the equiv-
alent of tenth-grade classes as a ninth
grader. As can be seen in the table, about
half of each accelerant group and the 9SY
group had enrolled in these courses, while
three-fourths of the 9SO group had. Thus,
the 9SO group had studied more tenth-
grade content. The difference in participa-
tion between the accelerant and 9SO groups
is particularly evident in social studies and
science. Participation by the 9AO and 9AY
groups in the compressed courses was very
limited.

Related to the question of participation
in condensed courses is that of enrollment
in special summer school programs for
enrichment. The percentages in Table 4
denote that half of the Acc 2-4 group, half
of the 9SO group, and about one-fifth of the
other groups (except 9AO with about
one-tenth) had attended at least one sum-
mer school session. However, the 9SO group
attended about twice as many summer ses-
sions as the Acc 2-4 group and four times
as many as Acc 3-5. This situation, coupled
with that in the preceding paragraph,
highlights the greater exposure to enriched
content that the 9SO group received.

TABLE 4
SUMMARY OF PARTICIPATION IN SCHOOL ACTIVITIES AND SPECIAL PROGRAMS IN JUNIOR HIGH

                                                              Group
Item                                                  Acc 2-4  Acc 3-5  9SY   9SO   9AY   9AO

Percentage who took compressed courses:
  Social studies (only 68 and 29 legible; columns uncertain)
  Accelerated math                                      73       57      ?     ?     5     5
  Accelerated science                                   41       36     30    82     5     0
  Accelerated English                                   41       50     48    55    10    17
  Average (only 56 legible; column uncertain)
Summer school:
  Percentage who attended one or more summer sessions   50       21     19    55    19     9
  Average number of sessions attended per group member  .64      .29    .48  1.23   .19   .09
Honor roll:
  Average percentage who attained annually              23       26     17    32     5     4
Varsity teams:
  Average percentage who participated annually           ?        ?      ?    41    18    48
Intramurals:
  Average percentage who participated annually          70       61     36    48    88    42
Activities:
  Average percentage who participated annually          21       19     22    36    17    14

? = entry illegible in source.

The table also indicates the percentage 
of each group attaining the honor roll. The 
percentage presented here and in the three 
categories below is actually an average 
percentage; the participation over the 3 
years in junior high school has been aver- 
aged to present an annual figure. Thus, on 
the average, 23% of the Acc 2-4 group at-
tained the honor roll each year, 26% of
the Acc 3-5 group, etc. Although slightly
more of the accelerated pupils attained
the honor roll than did the 9SYs, the 9SO
group placed slightly more pupils on the 
roll than the accelerants. Attainment of 
the honor roll by average pupils was 
negligible. 
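
The averaging of participation over the 3 junior high school years into a single annual figure amounts to simple arithmetic, sketched below. The group size and yearly counts are hypothetical and are not taken from the study.

```python
def average_annual_percentage(yearly_counts, group_size):
    """Mean of the yearly percentages of a group attaining or participating."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    yearly_pcts = [100.0 * count / group_size for count in yearly_counts]
    return sum(yearly_pcts) / len(yearly_pcts)


if __name__ == "__main__":
    # Hypothetical group of 40 pupils with 8, 10, and 9 on the honor roll
    # in Grades 7, 8, and 9 respectively.
    print(average_annual_percentage([8, 10, 9], 40))
```

An annual figure of this kind describes a typical year and does not show how many different pupils attained the honor roll at least once.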

The male accelerants as a group partici-
pated in fewer varsity sports than 9SY,
9SO, and 9AO. However, the accelerants
participated in more intramurals than any
of the other groups, and their overall par-
ticipation in sports is quite high, as it was
for all groups except 9SY and 9AY.
Whether less participation by intellec-
tually able boys in sports as members of a
varsity team is a strength or weakness of
the program is debatable at this time. The
relative proportions that complete first the
baccalaureate and then graduate school
programs are probably more important
than is current participation in athletics.

The last tabular entry reflects, again 
with an average annual percentage, the 
extent of participation in nine other ac- 
tivities (student council, school paper, club 
activities, cheerleading, science fair, talent 
show and other productions, orchestra, 
band, and chorus). As can be noted, some-
what higher participation was maintained
by the 9SO group with about a third of the
group being involved in each activity. All
other groups had an average involvement
of about one pupil of every five in each ac-
tivity. From the amount of participation
by the Acc 2-4 and Acc 3-5 groups, in com-
parison with other groups, one may infer
normal sociability and social develop- 
ment. 

DISCUSSION


At the end of the fourth and fifth grades,
the effects of acceleration on the cognitive,
psychomotor, and affective development of
bright older children were considered gen-
erally favorable. At the end of the fifth
grade, the then 5SO group was significantly
higher than the Acc 2-4 group on only two
of eight scores of the Metropolitan Achieve-
ment Test, none of ten tests of divergent
thinking, and one of six psychomotor 
tests. At the end of the ninth grade, the 
9SO group was significantly higher than 
the Acc 2-4 group on three of six edu- 
cational achievement tests, one of eight 
tests of divergent thinking, and two of 
three psychomotor tests. The increased 
superiority of the 9SO group in educa- 
tional attainment in the ninth grade, com- 
pared with the fifth, can be partially ac- 
counted for on the basis of their having 
taken more of the compressed courses for 
high-achievers during the junior high 
school years and also having taken more 
enrichment courses in summer programs. 
They had more opportunity to learn the 
subject matter included in the tenth- 
grade test battery used in this study. Their 
superiority in the physical measures prob- 
ably is associated with the greater differ- 
ence in physical development between the 
ages 14 and 13 in comparison with ages 
10 and 9, the respective nearest ages of the 
two groups as ninth and fifth graders. In 
turn, the superiority in physical develop- 
ment was represented by the 9SO boys par- 
ticipating in interscholastic athletics to a 
greater extent. 

The 9SY group is a critical comparison
group at the junior high school level, for
they are 6 months older than the Acc 2-4,
have had an additional year of schooling
(at considerable additional cost), and are
normally enrolled in the same classes with
the Acc 2-4 and 9SO groups. The Acc 2-4
group is not significantly different from
the 9SY on 17 of the 18 measures; on the
mathematics test, the Acc 2-4 was signifi-
cantly higher than the 9SY group. The
mean score for the Acc 2-4 group is ac-
tually higher on all six educational achieve-
ment tests, on the Flanagan Ingenuity
Test, and on six of the eight creative
thinking tests. The Acc 2-4 group has
slightly lower mean scores on the three
psychomotor measures. Further, there is
little difference between the Acc 2-4 group
and the 9SY in a variety of school activi-
ties, including enrollment in the compressed
classes and participation in social clubs or
activities. The Acc 2-4 participate slightly
less frequently on varsity teams but much
more frequently in intramural activities.
The preceding comparisons apply about
equally well to the Acc 3-5 group; how-
ever, they were not studied until the ninth
grade.

Not to be lost in the comparisons are
the 9AO and 9AY groups. They, too, have
had an additional year of schooling and
are considerably older than the accelerants.
On five of six measures of educational at-
tainment they are significantly lower than
both Acc groups and lower than the Acc
2-4 group on the other. On none of the 15
measures in the cognitive domain is either
of these groups superior to the Acc 2-4
group, and only the 9AO group is superior
on one of the three psychomotor measures.

These same groups will be studied again
in 2 years, and more definitive conclusions
may be possible then. Based upon all the
data collected toward the end of the ninth 
grade, the effects of acceleration are con- 
sidered completely desirable. Some bright 
older children should be accelerated dur- 
ing the elementary school years so that 
they become the younger high achieving 
members of their classes rather than re- 
maining the older members throughout 
their school life. One can predict with con- 
fidence that they will continue to be high 
achievers and to participate in many 
school activities throughout their high 
school years.


EDUCATIONAL PSYCHOLOGY
An Application of Social and Behavioral Theory 


by LOUIS M. SMITH and BRYCE B. HUDGINS 

both of Washington University, St. Louis 

Knopf; 576 pages; 21 figures; 73 tables; $8.50 
This is a comprehensive analysis of the principles and substantive findings of educa- 
tional psychology. Although strongly emphasizing the practical problems which 
confront every prospective teacher, the authors also present the most significant 
educational theories and recent research. 


SOCIETY'S CHILDREN 
A Study of Ressentiment 
in Secondary School Education 


by CARL NORDSTROM, Brooklyn College 

EDGAR Z. FRIEDENBERG, University of California at Davis 

and HILARY A. GOLD, Brooklyn College 

Random House, 1967; SED 7; 224 pages; $2.45 paperbound 
“A probing, daring, exciting and disquieting book which should be required reading
for every teacher . . .” Frederick E. Ellis, Western Washington State College


MENTAL HEALTH IN THE SCHOOLS 


by THOMAS A. RINGNESS, University of Wisconsin 
Random House, 1967; 512 pages; $6.95 
A group approach in which “mental health” problems are treated as essentially
problems in learning to be coped with through learning principles.


PSYCHOLOGY OF HUMAN ADJUSTMENT 


by LESTER D. CROW, Brooklyn College
Knopf, 1967; 640 pages; $7.95


“An excellent book. Good, concise discussions, clearly presented.” Mercedes Dyer,
Andrews University
An Instructor’s Manual is available.


RANDOM HOUSE, Inc.
ALFRED A. KNOPF, Inc.


The College Department 
501 Madison Avenue 
New York 10022 


Harper & Row


1817 


PSYCHOLOGICAL FOUNDATIONS
OF EDUCATION 


2nd Edition 


Morris L. Bigge and Maurice P. Hunt 


This well-documented text takes a semihistorical and comparative approach. 
The authors have reworked each chapter to update references, improve con- 
tinuity and style, and add new material. Part I, unique in scope, treats the 
biology, psychology, and sociology of human nature. Part II deals with child 
and youth development, both physiologically and psychologically. Part III, 
on learning theory, presents a more penetrating discussion of this topic than 
can be found in competing texts at present. Part IV demonstrates the practical 
classroom application of theoretical and factual material in the preceding sec- 
tions of the text. Instructor’s Manual. 603 pages. $9.95. Just published. 


PSYCHOLOGY IN THE CLASSROOM 
2nd Edition 
Rudolf Dreikurs 


This practical manual provides the prospective and in-service teacher with the 
background information and methods necessary to deal effectively with be- 
havior problems and learning deficiencies of students. Grounded in the philos- 
ophy of democracy and the socio-teleological approach of Adlerian psychology, 
this new edition enlarges on the significance and techniques of group approaches 
in the classroom, particularly the use of group discussions. 286 pages. Paper. 
$3.75. Just published. 


THEORIES OF COUNSELING 
AND PSYCHOTHERAPY 


C. H. Patterson 


More than 125 initial adoptions within the first year of publication 

This text presents summaries of fifteen major theories or points of view of coun- 
seling and psychotherapy, reviewed by the theorists discussed. Includes 
Alexander and French, Bordin, Ellis, Frankl, Grinker, Kelly, Dollard and 
Miller, Pepinsky, Phillips, Rogers, Rotter, Salter, Thorne, Williamson and 
Wolpe. Eleven of the fifteen theories are not available elsewhere in summary or 
condensed form. Extensive documentation and references. 578 pages. $9.75. 


LEARNING AND HUMAN ABILITIES: 
EDUCATIONAL PSYCHOLOGY 
2nd Edition 
Herbert J. Klausmeier and William Goodwin 


Almost completely rewritten and thoroughly updated, this edition continues to
emphasize the concept of emerging human abilities, thus integrating the treat- 
ment of growth and learning. Course outlines have been prepared for the in- 
structor, and suggest a variety of combinations of all or parts of each of the 18 
chapters. A Student Evaluation Guide is available to teachers. 720 pages. $8.95. 


A Student Workbook, by William Goodwin, Robert Conry, and Herbert J. 
Klausmeier is available for use with the text. $3.25. 


READINGS IN 
LEARNING AND HUMAN ABILITIES: 
EDUCATIONAL PSYCHOLOGY 
Richard E. Ripple 


Primarily intended as a companion to LEARNING AND HUMAN ABILI- 
TIES: EDUCATIONAL PSYCHOLOGY, the READINGS are also appro- 
priate for use with any of the current textbooks in educational psychology. Of 
the forty-nine readings (all unabridged), over 60% appear for the first time in 
a book of readings. 596 pages. Paper. $5.50. 


FOUNDATIONS OF HUMAN BEHAVIOR 
Louis Kaplan 


Here is a clear, concise presentation of the hard core of facts, theory and re- 
search relating to behavior adjustment, allowing the instructor to develop the 
implications for everyday life. 368 pages. $6.25. 


STATISTICAL CONCEPTS 
A Basic Program 


Jimmy Amos, Foster Lloyd Brown, and Oscar G. Mink 


This brief, field-tested, constructed-response, linear program, with examples 
from psychology and education, presents fundamental statistics and measure- 
ment concepts in a manner especially useful to introductory courses in the be-
havioral sciences. 125 pages. Paper. $2.95.


Harper & Row, Publishers


Important Studies in Education 


CURRENT ISSUES AND RESEARCH IN EDUCATION 
General Editor: Harry L. Miller, Hunter College 


This series encompasses and illuminates a wide range of new ideas, experimentation, and criticism 
in the rapidly expanding field of education. Each volume contains excerpts, summaries, and entire 
articles that survey the most recent research findings, comment on persistent issues, and evaluate
continuing experiments and new ideas for the future. The editors’ introductions provide the neces- 
sary historical background and underscore the relationships of the articles to general trends in the 
field. 

1967 paper, $2.95 each volume 


EDUCATION FOR THE DISADVANTAGED 
Edited by Harry L. Miller


This collection of 49 articles provides an over-all view of current attitudes, research, and contro- 
versy in the area of educating the disadvantaged child. The articles stress urban and “inner city” 
problems in the schools, and discuss the testing dilemma, experimental projects, curriculum issues, 
teacher training, and new approaches to desegregation. 302 pages 


THE PSYCHOLOGY OF EDUCATION 
Edited by Donald H. Clark, Hunter College 
Preface by Harry L. Miller 


The 52 selections in this volume represent the most important current research in the psychology 
of education. The topics covered include Head Start, intelligence, emotional resources, ability 
grouping, and teaching the dyslexic child. The editor places special emphasis on the significance 
of these issues for teachers in today’s schools. 288 pages 


ELEMENTARY EDUCATION 
Edited by Maurie Hillson, Rutgers University 
Preface by Harry L. Miller 


The 31 selections in this study discuss areas of importance in elementary education including the 
mathematics revolution, science, foreign language teaching, television in the classroom, and non-
grading. The selections on reading include basal reading, reading at the kindergarten level, and
children’s literature in the classroom. 330 pages 


SOCIAL FOUNDATIONS OF EDUCATION 


Edited by Dorothy Westby-Gibson, San Francisco State College 
Foreword by Harry L. Miller 


This collection of 49 articles presents a comprehensive examination of the social foundations of 
education from the impact of social change and the pressures on the child from kindergarten
through college, to the status of teachers and teacher organizations. 313 pages 


from The Free Press

THE COUNSELING OF COLLEGE STUDENTS

Function, Practice, and Technique 

Edited by Max Siegel, Brooklyn College 

Foreword by Harry D. Gideonse 
This manual of practical approaches to college counseling concerns itself with the everyday realities
of the counselor’s world. With original contributions by 18 authorities in the field, the book covers 
all the significant aspects of student counseling, focusing on function, practice, and technique with 
a depth of coverage not found in any other book on the subject. 
January, 1968 480 pages (approx) $0.95 


SCHOOL CHILDREN IN THE URBAN SLUM 
Readings in Social Science Research

Edited by Joan I. Roberts 
School Children in the Urban Slum is a collection of research studies and reviews of research in the social 
sciences chosen for their pertinence to educational problems encountered in urban slum schools. It 
is the result of work undertaken for Project TRUE (Teacher Resources for Urban Education) at
Hunter College. Offering a complete survey of problems teachers face in today’s urban schools, 
the selections are drawn from anthropology, sociology, and psychology. 
1967 639 pages $7.50 

EDUCATION IN THE METROPOLIS 

Edited by Harry L. Miller and Marjorie B. Smiley, both of Hunter College 
Resulting from the findings of Project TRUE, this collection of readings discusses the social and
economic background of the disadvantaged urban student. The selections have been chosen to 
provide the information and concepts necessary to understand the current problems of the urban 
school, and present a basis for discussing solutions to these problems. The selections are taken from 
the works of Michael Harrington, James Baldwin, Herbert Gans, David P. Ausubel, and others. 
The study is illustrated with twelve photographs by Mark Feldstein of street life in the “inner city.”
1967 303 pages paper, $2.95 


Coming April, 1968: the companion volume to Education in the Metropolis
POLICY ISSUES IN URBAN EDUCATION 
Edited by Marjorie B. Smiley and Harry L. Miller 
512 pages 
A New Free Press Paperback Edition:


TEACHING THE TROUBLED CHILD 
By George T. Donahue and Sol Nichtern 


With a New Preface by the Authors 


A thoroughly tested program for the education of emotionally disturbed children within the regular
school structure. “. . . a challenging and provocative approach.” Childhood Education

212 pages

paper, $3.50 tent.
$2.45 tent.


THE FREE PRESS
New York, N.Y. 10022


EDUCATIONAL PSYCHOLOGY 
IN THE CLASSROOM 

Third Edition 

By HENRY CLAY LINDGREN, San Fran- 
cisco State College. A solid revision of a popu- 
lar text. The Third Edition brings your 
students: A unique chapter on the slum child •
Strong emphasis on the contributions of social
psychology • A comprehensive treatment of
field theory and its applications to educational
practices and problems • A non-clinical con-
sideration of mental health as it affects and is
affected by problems of classroom learning. 
1967 686 pages $8.50 


READINGS IN 
EDUCATIONAL PSYCHOLOGY 


By HENRY CLAY LINDGREN. 


1968 Approx. 504 pages 
Paper: $6.95 


STUDYING THE CHILD 
IN SCHOOL 


By IRA J. GORDON, University of Florida. 
Gives a theoretical overview of child assess- 
ment in the modern school, as well as Practice 
in using a variety of techniques for assessment. 


1966 146 pages $4.95 


Cloth: $8.95 


Cloth: $4.96 
Paper: $2.95 


JOHN WILEY & SONS, Inc.


A sampling of recent books from Wiley 


CREATIVITY: 
Its Educational Implications 


Edited by JOHN CURTIS GOWAN, San 
Fernando Valley State College; GEORGE D.
DEMOS, California State College, Long Beach;
and E. PAUL TORRANCE, the University of
Georgia. Readings that show how the results 
of creativity research can be turned to use in 
the school, the classroom, and the counseling 
office. 
1967 Cloth: $7.95 
Paper: $4.95 


336 pages 


READINGS IN THE 
PSYCHOLOGY OF 
PARENT-CHILD RELATIONS 


Edited by GENE R. MEDINNUS, San Jose 
State College. 
1967 871 pages $4.50 
EDUCATION AND 


SOCIAL CRISIS: 


Perspectives on Teaching 
Disadvantaged Youth 


Edited by EVERETT T. KEACH, Jr.,
WILLIAM E. GARDNER, and ROBERT
FULTON, all of the University of Minnesota.
1967 418 pages Cloth: $7.95 Paper: $4.95 


PARENTS LEARN 
THROUGH DISCUSSION 


Principles and Practices 
of Parent Group Education
By ALINE B. AUERBACH, formerly of
The Child Study Association of America. This
source book on the philosophy, goals and
techniques of the group education method for
parents is written by an eminent authority in
parent education. Drawing upon her years of
experience in developing and directing parent
group education programs for The Child Study
Association, she presents a detailed, practical
guide for setting up and conducting discussion
groups for parents and expectant parents.
Attention is also given to groups for parents
whose children are physically or emotionally
handicapped, unwed mothers, and adoptive
parents and others.

Considerable stress is given to the group
experience of parents from low socioeconomic
and educational backgrounds. Much first-hand
evidence shows how these and other parents
make use of educational group programs when
they are clearly geared to their immediate
situations and present needs.
Although the group methods described here
pertain to work with parents, the principles
and techniques are applicable to many group
programs. 1968 Approx. $84 pages $7.96
605 Third Avenue, New York, N. Y. 10016


A HISTORY OF 
GENETIC PSYCHOLOGY 


The First Science of 
Human Development 


Edited by ROBERT E. GRINDER, University 
of Wisconsin. Traces concepts of fertilization, 
heredity, and growth from the early Greeks 
to G. Stanley Hall—the father of the child 
study movement in America. 1967 247 pages 


$8.95 


STUDIES IN 
COGNITIVE GROWTH 


By JEROME S. BRUNER, Harvard Uni- 
versity; with ROSE R. OLVER, Amherst 
College; PATRICIA M. GREENFIELD, et al.
The first major theoretical assessment of the 
process of cognitive development in children 
since the pioneering work of Jean Piaget and 
his colleagues at the University of Geneva. 


1966 $438 pages $7.96 


CLASSROOM GROUPING 
FOR TEACHABILITY 


By HERBERT A. THELEN, University of 
Chicago. The result of a three-year research
investigation, this important new book pre- 
sents the rationale and procedure for “com- 
patibility” grouping and cites evidence of its 
validity. 1967 27h pages $7.60

Looking for better textbooks?


Educational Psychology Third Edition 
By Glenn Myers Blair, R. Stewart Jones, and Ray H. Simpson, 


all, The University of Illinois 


This successful text, now in its Third Edition, deals with the problems con-
sidered most important by teachers and school psychologists. It includes new
material on the psychology of the teacher and teacher self-appraisal. The book
is unusual for its in-depth treatment of psychology of adolescence, the social
psychology of teaching, and diagnostic and remedial procedures. Principles of
learning are constantly supported by experimental evidence. Psychological
theories are illustrated by actual classroom examples.
Intended for use in undergraduate courses, the book presents a comprehensive,
thoroughly documented and highly practical approach to problems confront- 
ing teachers and psychologists. While it is of value to both, it is addressed 
particularly to the teacher in his role of responsibility for preparing the child 
to enter the complex, technological modern world. 
Clearly and imaginatively written, this book gives the reader practical, up- 
to-date help in engineering the learning process in a manner consonant with 
sound psychological theory. A Teachers’ Manual is available. 

1968, 704 pages, $8.50 


Readings in Educational Psychology Second Edition 
Edited by Victor H. Noll and Rachel P. Noll, both, Michigan State University 


This thoroughly revised collection of readings provides the student in educa- 
tional psychology with a source of significant and relevant literature in the 
field. Twenty-eight of the articles are new and reflect a changing emphasis in
educational psychology. Especially written for this anthology are articles by
Robert C. Craig, Robert L. Ebel, Ruth Strang, Elizabeth M. Drews, and
William Gnagey. 


1968, 464 pages, paper, $4.25 
Write to the Faculty Service Desk for examination copies.


THE MACMILLAN COMPANY 2866 Third Avenue, New York, N.Y. 10022 


What’s happening in education


PSYCHOLOGY FOR EFFECTIVE TEACHING 
Second Edition 


By George J. Mouly, University of Miami 


Revised and updated to highlight recent developments of major psychological signifi- 
cance, the Second Edition of this text emphasizes (1) the recent awareness of the 
crucial role of environmental influences in the development of one’s potentialities, 
(2) the new orientation toward the stimulation theories of motivation, and (3) the 
current interest in creativity and the multi-dimensional nature of the human in-
tellect.
January 1968 656 pp. $8.95 


EDUCATIONAL PSYCHOLOGY 

A Cognitive View 

By David P. Ausubel, University of Toronto 

The basic premise of this book is that educational psychology is primarily concerned
with the nature, conditions, outcomes, and evaluation of classroom learning. Unlike
most other books in the field, it does not treat educational psychology as a loose
of learning theory, developmental psychology, mental hygiene, and edu-
cational or psychological measurement, or as a simplified version of general psy-
chology.

May 1968 640 pp. 


THE PROCESS OF SCHOOLING 
A Psychological Examination 


By J.M. Stephens, University of British Columbia 


Professor Stephens explores the phenomena of learning as a process that somehow 
continues in spite of innovations and c . This point of view represents the 
ition which is intended to provoke some different, new
at the educational work teachers are trying to do.
DISCOVERY AND EXPOSITORY TASK PRESENTATION
IN ELEMENTARY MATHEMATICS¹


BLAINE R. WORTHEN²
University of Utah


This study compared 2 methods of task presentation which differed
primarily in terms of sequence characteristics. 432 5th- and 6th-grade
Ss were presented with 6 weeks of instruction in elementary mathe-
matics through textlike experimental sequences introduced by class-
room teachers trained in both discovery (Treatment D) and exposi-
tory (Treatment E) sequencing. Analysis revealed equal teacher
adherence to prescribed teaching behavior models in each treatment.
Treatment E proved superior on initial learning, while Treatment D
was superior on retention and transfer of heuristics. Treatment D
was slightly superior to Treatment E on a test of transfer. Given
equal time on the learning task, Ss in Treatment D proved superior
to Ss in Treatment E on a majority of intertreatment comparisons.


The past decade has been marked by a
continuing controversy over the relative
efficacy of differential methods of task
presentation which have been loosely re-
ferred to as “discovery” and “expository”
methods. Adherents of the discovery method
argue strongly for its superiority as a


¹This investigation was supported by the Co-
operative Research Program of the Office of Edu-
cation, United States Department of Health, Edu-
cation and Welfare—Project 2277 and constitutes
part of the final report of that project (Della-
Piana, Eldredge, & Worthen, 1965). The data col-
lected for this investigation also served as an es-
sential portion of a master’s thesis (Worthen, 1965)
submitted to the Department of Education, Uni-
versity of Utah. For a more complete review of re-
lated research and a detailed description of all
methods, analyses, instrument validation, results,
etc., the reader is referred to either of the above
reports or the American Documentation Institute.
The curriculum materials used with both the dis-
covery method and the expository method, instru-
ments used for evaluating the outcome of the ex-
periment, and several tables of additional data
have been deposited with the American Documen-
tation Institute. Order Document Number 9633
from ADI Auxiliary Publications Project, Photo-
duplication Service, Library of Congress, Washing-
ton, D. C. 20540. Remit in advance $26.25 for pho-
tocopies or $7.50 for microfilm and make checks
payable to: Chief, Photoduplication Service, Li-
brary of Congress.
²Now at the Ohio State University.
method of teaching and claim that “dis-
covery learning” enhances retention and
transfer of concepts as well as pupil moti-
vation (Beberman, 1958; Bruner, 1960).
Critics of the discovery method discount
it as pedagogically impractical and argue 
that it offers little to the learner that can- 
not be offered equally well by good ex- 
pository teaching (Ausubel, 1961, 1964). 
Apparent support for each of these po- 
sitions can be found in the research litera- 
ture. For example, Swenson (1949) and 
Ray (1961) conducted comparative studies 
in which “discovery” methods produced 
significantly better results on retention and 
transfer measures than did “expository” 
methods. But Craig (1956), Kittell (1957), 
and Kersh (1962) found just the opposite 
to be true. Investigations by Hendrix 
(1947) and Gagné and Brown (1961) sup- 
port discovery as superior to exposition in 
terms of transfer of learning. But studies 
by Craig (1953) and Corman (1957) favor 
expository techniques on transfer tests. 
Perhaps the greatest factor which con- 
tributes to such equivocal research evi- 
dence is the differing specification among re-
searchers as to what they mean by such
terms as “discovery,” “guided-discovery,”
and “exposition.” Since these terms have
not yet been reduced to generally accepted 
operational definitions, it is highly probable
that researchers working in what is nomi-
nally the same domain are not actually in-
vestigating the same phenomena at all. As
Wittrock (1966) points out, in an excellent 
review and analysis of the literature on 
learning by discovery, semantic inconsist- 
ency in labeling differing treatments with 
the same name has been a major factor in 
precluding any important conclusions about 
learning or teaching from being drawn from 
such research. Many investigators have 
been primarily concerned with the amount 
and type of external guidance to which the 
learner is subjected. Others have been con- 
cerned chiefly with the role of verbalization 
in discovery-expository processes. Still other 
researchers have focused on feedback 
mechanisms or on rate of presentation. Such 
wide divergence in the variables controlled 
in various studies has led to investigation 
of widely differing facets of the discovery 
and expository processes and a consequent 
noncomparability of the results (Della- 
Piana, Eldredge, & Worthen, 1965). 

Previous studies have been almost wholly 
exploratory in nature, however, and it is 
not surprising that direct comparisons 
among the results are impossible at pres- 
ent. Much more exploration is necessary 
before it will be possible to discern which, 
if any, comparisons are legitimate. The 
relevant teaching-learning variables must 
be more fully identified and interrelated 
and systematic research conducted on in- 
teractions of method with differing types of 
teachers, pupils, and subject matter. Until 
this is done, no unequivocal pattern can 
be hoped to emerge from discovery-exposi-
tory research.

While some of the relevant variables 
have been extensively explored, others 
have received relatively little attention. 
One variable which has been largely ignored 
in previous “discovery” studies is that of 
the sequence characteristics of the learning 
tasks. It could be argued that the type or 
amount of external guidance or verbaliza- 
tion is no more important in concept for- 
mation than the sequencing of such guid- 
ance or verbalization. Certainly this aspect 
of discovery teaching deserves investiga-
tion in its own right. One major purpose
of the present study was to take a first 
step in such investigation by describing 
and comparing a discovery method and an 
expository method which differed in terms 
of the sequence characteristics of the pres- 
entations. 

Most “discovery” studies have been
conducted in a laboratory setting and con-
sequently have dealt with small time
samples, small numbers of subjects (Ss),
and very discrete and often manipulative
learning tasks. One might argue that such
sampling of time, Ss, and tasks is so re-
strictive and limited in scope that any
attempt to generalize the results to class-
room learning or instruction would be sub-
ject to serious question. It would seem
that the results of a carefully-controlled
classroom experiment where both time
sample and learning task are representa-
tive of typical school behavior and cur-
riculum could be generalized to classroom
practice with more confidence than could
the results of the typical short-term lab-
oratory experiment.³ A second major pur-
pose of the present study was to compare
the two instructional methods in a nat-
uralistic setting where the learning tasks
and time sample approximated normal
classroom conditions. Accordingly, certain
concepts in elementary mathematics were
selected as content for the two differing
instructional sequences. These sequences
were presented to the Ss through textlike
instructional programs introduced into the
classroom by teachers trained in both ex-
perimental methods.

The criteria used to measure the out-
comes of instruction included the follow-
ing: tests of initial learning, retention, and
transfer of the selected mathematical con-
cepts; tests for transfer of heuristics; and
measures of attitude toward the subject
content. A complete listing and brief de-
scription of these criterion measures ap-
pears in the section “Tests and Measures.”

³The difficulty of controlling research in a nat-
uralistic classroom setting has been documented
(Bellack, Davitz, Kliebard, & Hyman, 1u6e, pp.
165-168; McDonald, 1964, p. 542) and is acknowl-
edged by the investigator. It would seem, however,
that difficulty should not prevent attempts to find
productive ways to utilize the classroom as a re-
search setting.


Hypotheses

Although no concrete conclusions can
be drawn from the hodgepodge of research
which has been conducted on discovery
learning, some plausible inferences can be
made.

One such inference is that short-term 
studies should tend to favor expository in- 
struction while long-term studies should 
favor discovery learning. This inference is 
based on the assumption that pupils have 
typically been trained to learn from ex- 
pository instruction and, hence, need a 
relatively longer period of time in which 
to learn to utilize the techniques necessary 
in discovery learning. Based on this as-
sumption, and in view of the length of the
present study, it was hypothesized that
the discovery method used in this study
(Treatment D) would produce superior
results to the expository method (Treat-
ment E) on tests of initial learning, re-
tention, and transfer of the selected math-
ematical concepts.

An additional hypothesis was based
partly on work by Kersh (1958, 1962),
which suggested that any advantages of
discovery learning could best be explained
in terms of pupil motivation, and partly on
the experimenter’s experience with pupils
in the initial tryout of the instructional
materials. During this tryout, pupils who
were exposed to the Treatment D instruc-
tional sequences made considerably more
unsolicited statements indicating that they
liked the “new materials” than did pupils
exposed to the Treatment E instructional
sequences. It was hypothesized that Treat-
ment D would produce superior results to
Treatment E on measures of attitude to-
ward the subject content.

A final hypothesis was based on a log-
ical extension of interpretations by Such-
man (1962) and Della-Piana (1957) re-
lating to searching set and its effect upon
conceptual data processing. It was hypoth-
esized that Treatment D would produce
superior results to Treatment E on tests for
transfer of heuristics.
METHOD
Treatments 
Brief definitions of the experimental methods 
appear below. 


Discovery method (Treatment D). Treatment 
D is defined as a method in which verbalization of 
each concept or generalization is delayed until the 
end of the instructional sequence by which the 
concept or generalization is to be taught. The pupil 
is presented with an ordered, structured series of 
examples of a generalization. The sequence of pres- 
entation maximizes the possibility of the pupil for- 
mulating awareness of the generalization more 
readily than if the examples are randomly pre- 
sented. No explanation of the examples is given, 
nor is there any hint that there is an underlying 
principle to be discovered. The pupil, merely in- 
structed to solve the problems, is expected to ac- 
quire the mathematical concept, principle, or gen- 
eralization through an inference of his own. 

Expository method (Treatment E). Treatment
E is defined as a method in which the verbalization
of the required concept or generalization is the
initial step in the instructional sequence by which
the concept or generalization is to be taught. The
mathematical principle is presented to the pupil
and explained verbally and symbolically. The pupil
works with examples of the principle or generaliza-
tion only after the initial verbal and symbolic
presentation. Particular attention is given to insure
that practice is made meaningful by continual
stress being placed upon the relation of the ex-
ample to the generalization and upon “why” the
generalization operates as it does. This is to mini-
mize rote memorization of the principle by the
pupil.

Subjects 


the elementary schools in the district in terms of 

socioeconomic and geographical characteristics. 
The teachers were selected on the basis of the 

following criteria: (a) mathematical and general 


⁴A control group, comprised of 106 pupils in 3
sixth-grade classes, received both the pre- and
posttests but received no special instruction during
the intervening 6-week period. This group was in-
cluded in the study in order to provide normal
baseline data against which to assess effects of the
two experimental treatments. Results of the inter-
treatment comparisons between the experimental
groups and the control group appear in detail in
previous reports of this research but are omitted
here in the interest of brevity. It should be noted,
however, that the results of these comparisons sup-
port the findings and conclusions reported herein.
teaching competence as judged by supervisors, (b)
minimum of 3 years of teaching experience, and 
(c) willingness to participate in this research proj- 
ect. The selection of the teachers determined the 
selection of the sample; Ss used in this study were 
pupils in established classes of the selected teach- 
ers. 


Experimental Design and Controls 


In each of eight schools, two classes were taught 
arithmetic by the same teacher, one class by 
Treatment D and one class by Treatment E. This 
was done in an attempt to control the dimensions 
of teacher personality and other teacher charac- 
teristics. Seven of the teachers taught two sixth- 
grade classes each while the eighth teacher taught 
two fifth-grade classes. Seven of the eight teachers 
taught their own homeroom class as one of the ex- 
perimental groups. In an attempt to control possi- 
ble differential in pupil-teacher interaction between 
homeroom and nonhomeroom classes, the number 
of homerooms receiving each treatment was bal- 
anced as nearly as possible. The assignment proce- 
dures also balanced as nearly as possible the num- 
ber of classes receiving each treatment during any 
particular segment of the school day. Although 
there was no reason to believe that the selection 
and assignment procedures would bias the sample, 
a preliminary inspection of the mean values for 
each treatment group was conducted on several 
pretreatment measures including IQ, arithmetic 
computation skill, arithmetic problem-solving abil- 
ity, prior knowledge of the selected mathematical 
concepts, prior attitude toward arithmetic, and 
pupil perception of teaching behavior. The only 
significant differences found between the experi- 
mental groups were on the attitude measures. The 
Ss in Treatment E entered the experimental period 
with significantly more favorable attitudes toward 
arithmetic than Ss in Treatment D. 
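
The balancing of homeroom assignments described above can be illustrated with a minimal sketch. The alternation rule, the teacher labels, and the choice of which teacher had no homeroom class among the experimental groups are assumptions made purely for illustration; the report does not state how the balance was actually achieved.

```python
from itertools import cycle


def assign_treatments(teachers, teaches_homeroom):
    """Give each teacher one Treatment D class and one Treatment E class.

    Hypothetical rule: the treatment given to the homeroom class alternates
    from one homeroom teacher to the next, so that homerooms are split as
    evenly as possible between the two treatments.
    """
    next_homeroom_treatment = cycle(["D", "E"])
    plan = {}
    for teacher in teachers:
        if teaches_homeroom[teacher]:
            homeroom = next(next_homeroom_treatment)
            other = "E" if homeroom == "D" else "D"
            plan[teacher] = {"homeroom": homeroom, "nonhomeroom": other}
        else:
            # Neither class is this teacher's homeroom.
            plan[teacher] = {"class_1": "D", "class_2": "E"}
    return plan


def count_homeroom_treatments(plan):
    """Count how many homeroom classes receive each treatment."""
    counts = {"D": 0, "E": 0}
    for classes in plan.values():
        if "homeroom" in classes:
            counts[classes["homeroom"]] += 1
    return counts


# Eight teachers, seven of whom teach their own homeroom (labels are invented).
teachers = [f"teacher_{i}" for i in range(1, 9)]
teaches_homeroom = {t: t != "teacher_8" for t in teachers}
plan = assign_treatments(teachers, teaches_homeroom)
print(count_homeroom_treatments(plan))
```

With seven homeroom classes an exact balance is impossible, so a rule of this kind yields a four-to-three split, which is the nearest attainable balance.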

The major nonexperimental variables controlled 
in this study are presented below. 

1. The Ss in Treatments D and E received the 
same length of time to work on the learning tasks. 

2. The amount of verbalization in the teachers’ 
oral presentation and in the written instructional 
materials was held constant in both treatments. 
Verbalization of the mathematical generalizations 
varied in sequence between the two treatments but 
was present in both. 

3. Three techniques were used in this study in 
an attempt to assess the extent to which the teach- 
ers taught by the specified methods. These tech- 
niques (utilizing instruments described later) in- 
cluded the following: (a) rating by observer-raters 
of a 10% sample of the total teaching behavior of 
each teacher in each treatment; (b) rating of a 
10% sample of total teaching behavior of each 
teacher in each treatment from lessons recorded 
on audio-tape; and (c) rating by pupils of teach- 
ing behavior on the discovery-expository dimen- 
sion. 

4. All of the procedures and methods utilized 
in this study were selected in an attempt to mini-
mize or negate any differential “Hawthorne effect”
between the two experimental groups. Directions 
given to both treatment groups were as nearly 
identical as possible. All procedures used in one 
treatment were used in the other, and both groups 
were made to feel that they held shared status 
with relation to the “new math project.” 

5. An attempt was made to equalize the pre-
experimental mathematical experiences of all Ss in
Treatments D and E by presentation, during a 2-
month period immediately preceding the pretests,
of a unit which included both specific and general
mathematical concepts judged to provide neces-
sary background for the experimental materials. In
addition, confounding of the experimental results
by nonexperimental arithmetic experiences was
minimized by a request that no homework or out-
of-school arithmetic assignments be given. District
personnel complied with this request and also
elicited parental cooperation.

The experimental period, 40 minutes daily for 
each treatment, consisted of 3 days of pretests, 
6 weeks of instruction, and 5 days of posttests. 


Training Program for Teachers and Raters 


All raters and teachers attended a training class
which met from 2 to 6 hours weekly for 20 weeks,
13 weeks prior to and 7 weeks during the experi-
ment. Training was given in: (a) general mathe-
matical concepts necessary as background; (b) all
mathematical concepts used in the instructional
materials and criterion measures; (c) use of the
two specific methods of instruction; and (d) pro-
cedures for administering and scoring the various
tests, scales, and questionnaires.


Instructional Materials 


The instructional materials presented several
mathematical concepts selected on the basis of
suitability for both discovery and expository teach-
ing and probable unfamiliarity to Ss at the incep-
tion of the study. The concepts selected were the
following: (a) notation, addition, and multiplica-
tion of integers (positive, negative, and zero);
(b) the distributive principle of multiplication
over addition; and (c) exponential notation and
multiplication and division of numbers expressed
in exponential notation.

The materials were equated in terms of the
mathematical concepts, diagrams of physical mod-
els, number and type of examples, and degree of
verbal presentation used in each treatment. The
two sets of materials differed primarily in terms
of sequence characteristics.


Instructional Procedures 


The instructional procedures in each treatment
were largely determined by the requirement that
the teachers follow the structural sequences of the
instructional materials. However, some aspects of
teaching behavior were judged to be independent
of the instructional materials but still important
in maintaining the essential sequence differences
between the two treatments. The characteristics of
teaching behavior which seemed most operative in
this regard include the following: (a) interjection
of teacher knowledge, (b) introduction of generali-
zation, (c) method of answering questions, (d)
control of pupil interaction, and (e) method of
eliminating false concepts. Model “discovery”
teaching behavior and model “expository” teach-
ing behavior on each of these five characteristics
were specified and a paradigm of teaching tech-
niques for each characteristic was established in
each treatment. A brief summary of model teach-
ing behavior for each treatment on each of the
five characteristics of teaching behavior is given
below.

1. Interjection of teacher knowledge—Treat-
ment E. The teacher acts as the primary source of
knowledge concerning arithmetic. The students
may depend on the teacher when they cannot work
a problem. The teacher always indicates that he
will show the student how to work the problem
correctly. He is never doubtful or uncertain. The
teacher checks the answers of the students and
shows them how the correct answer is obtained by
use of the principle involved. When an incorrect
answer is given, the teacher recognizes it and im-
mediately asks the student if he is certain that his
answer is correct. This gives the student an op-
portunity to correct his own mistake. If the stu-
dent is unable to do so, the teacher asks for some-
one else in the class to respond. Care is taken,
however, to avoid any negative evaluation of the
student’s response.

Treatment D. The teacher does not act as the
primary source of knowledge concerning arith-
metic, but seems to depend upon the students to
help him work the problems. He sometimes reflects
an uncertain feeling about the precise way to solve
a particular problem. When given an answer,
whether correct or incorrect, he checks it by the
long method as if he is not aware of the principle
which would allow for solution by a “short-
cut.” When a student gives an incorrect answer,
the teacher continues on to the next problem as if
unaware that the answer is incorrect. When a stu-
dent points it out, the teacher acts surprised. If the
students fail to notice the mistake, the teacher
comes back a short time later, as if he has just no-
ticed it, and questions the correctness of the earlier
response. The student who gave the response is
allowed an opportunity to correct it. If he is unable
to do so, other members of the class are asked to
respond.


2. Introduction of generalization—Treatment E.
The teacher gives the generalization (rule) before
the students are given any examples. All examples
are then related back to the rule for solution.

Treatment D. The teacher delays the verbaliza-
tion of the generalization until all, or nearly all of
the students have made the discovery. He is care-
ful to give no hint that there is a “shortcut” to
working the problems. He also takes care to avoid
the use of vocabulary terms related to the general-
ization.

3. Method of answering questions—Treatment
E. The teacher answers questions by reiterating
and explaining the rule and relating it to the ques- 
tion. The teacher then gives examples which will 
further clarify the way the rule is used in the solu- 
tion of that type of problem. 

Treatment D. The teacher answers questions by 
referring to the model or the computational se- 
quence which the student has used. If a student is 
still confused, the teacher takes him back to the 
model and goes through it carefully. The teacher 
may make use of sequenced examples as a clue, 
but no verbal hint of the rule is given. 

4. Control of pupil interaction—Treatment E. 
The teacher allows the students to share ideas 
about arithmetic. He encourages them to help 
each other if they can show the other person how 
to apply the rule. Often he allows them to check 
an answer with their neighbor but insists that they 
do their own work. 

Treatment D. The teacher always prevents the 
students from sharing ideas about arithmetic. The 
teacher insists that no student does anything which 
may inhibit another child’s discovery by giving 
the rule to him prematurely. If a student does 
verbalize the rule, the teacher expresses displeas- 
ure. 

5. Method of eliminating false concepts—Treat-
ment E. The teacher warns the students of com-
mon errors made in applying the principle. He 
points out specifically the types of problems the 
students are likely to make errors on and then 
gives examples of each kind of error. 

Treatment D. The teacher includes “trap ques- 
tions” and gives no verbal warning of any type. 
The teacher purposely leads the child, through 
sequencing of examples, into overgeneralizing the 
rule. If the problem is missed, the teacher waits 
until the error is mentioned, then checks to make 
certain it is wrong. He says nothing about the rule 
or why the answer was incorrect. 

Adherence to the model techniques of teaching 
specified for each treatment and to the sequence of 
presentation determined by the instructional mate- 
rials was assessed by observer and pupil rating 
scales (described hereafter). Scores on these scales 
were used as an index of teacher fidelity in the 
presentation of the treatment. 

Because of the wide range of ability among and 
within classes, teachers were allowed to vary their 
rate of instruction in order to fit the needs of their 
particular class. Although the total time consumed 
by each treatment was held equal, how far each 
class progressed in the instructional materials was 
dictated by the rate of instruction. Teachers were 
required to cover each concept and principle care-
fully, using the prescribed teaching techniques, fol-
lowing the sequence dictated by the materials, and 
making every attempt to make both treatments 
equally meaningful. In order to insure adequate 
presentation of the concepts to both treatment 
groups, the criterion was established that a mini- 
mum of 85% of each class must attain a specified 
minimum level of understanding of each concept 
before the teacher was allowed to proceed to the 
next concept. Criterion items for each concept were 


TABLE 1
RELIABILITY ESTIMATES FOR SEVEN INSTRUMENTS

                                        Reliability
Instrument                    Test-retest          Parallel form
                          Consec.   6-week      Consec.   6-week
                          days      interval    days      interval

Semantic Differential Attitude Scale      .87   .44
Statement Attitude Scale                  .78   .75
Pupil Perception of Teaching Behavior     .92   .92
Concept Knowledge Test                    .70ᵃ  .78ᵃ
Concept Retention Test                    .75ᵃ  .75ᵃ  .69ᵃ
Concept Transfer Test                     .82
Negative Concept Transfer Test            .64


Note.—For the consecutive days, N = 57, for
the 6-week interval, N = 106.

ᵃ Reliability coefficients on these instruments
are somewhat difficult to interpret. Because the
content measured by these instruments was gen-
erally unique to fifth- and sixth-grade pupils, only
those pupils used as Ss in the experimental treat-
ments and thus exposed to the content were able
to score well consistently. Pupils in the reliability
reference groups achieved very low scores and
exhibited little variation. In view of the direct
relationship between the magnitude of a reliability
coefficient and the variation of the sample on
which it is based, the reliability estimates reported
for these instruments were judged to be com-
pletely acceptable.


built into the materials to enable this test to be 
made. The application of this criterion resulted in 
some variation among schools in the amount of 
the instructional materials completed during the 
experimental period. There was virtually no varia- 
tion between treatments, however, when summed 
across schools. In seven of the eight schools, the 
number of instructional units completed by the 
two treatment groups was equal. In the eighth 
school, the Treatment E class completed two pages 
more of the instructional materials than the Treat- 
ment D group in the same period of time. This
difference was discounted by the experimenter as
inconsequential.
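
The pacing rule described in this section (a minimum of 85% of each class attaining a specified minimum level before the teacher proceeds) can be expressed as a short check. This is only a sketch: the function name, the score values, and the minimum level of 2 are hypothetical, since the report does not give the actual criterion items or cutoff.

```python
def class_may_proceed(pupil_scores, minimum_level, required_percent=85):
    """Return True when enough pupils have reached the minimum level.

    pupil_scores: one criterion-item score per pupil (hypothetical scale).
    minimum_level: lowest score counted as adequate understanding.
    required_percent: share of the class that must reach that level.
    """
    if not pupil_scores:
        raise ValueError("pupil_scores must not be empty")
    attained = sum(1 for score in pupil_scores if score >= minimum_level)
    # Integer comparison avoids floating-point rounding at exactly 85%.
    return attained * 100 >= required_percent * len(pupil_scores)


# Invented class of 20 pupils: 17 reach the minimum level, 3 do not.
scores = [3] * 17 + [1] * 3
print(class_may_proceed(scores, minimum_level=2))
```

Because the rule is applied class by class, classes that reach the criterion more slowly cover less material in the same total time, which is the source of the variation among schools noted above.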


Tests and Measures 


Ten instruments were developed for this study,
nine of which were administered to pupils. The
tenth was used to rate teacher behavior. Reliability
coefficients for seven of these instruments appear
in Table 1. Reliability of the other three instru-
ments is discussed below in connection with the
description of those instruments. Additional relia-
bility information and material related to validity
of the instruments can be found in the original re-
port of this research (Worthen, 1965).

Prior knowledge of the selected mathematical
concepts was measured by a test (Concept Knowl-
edge Test) administered to both treatment groups
in the pretest series. Initial learning was measured
by the four subsections of this same test adminis-
tered at the completion of the corresponding sub-
section of the instructional materials. A parallel
form of this test (Concept Retention Test) was ad-
ministered to both treatment groups 5 weeks and
11 weeks after the first administration in order to
measure retention. Both instruments were also
administered to the reliability reference groups.
Scores on these two forms were correlated and the
coefficients, reported in Table 1, bear out the claim
of parallelism. The correlations between forms are
approximately equal to the reliability coefficients
reported for either form.

A positive transfer test (Concept Transfer Test)
was administered to both treatment groups in the
posttest series and was used to evaluate Ss’ ability
to recognize and apply mathematical principles in
situations unlike those in which they were origi-
nally presented. A negative transfer test (Negative
Concept Transfer Test) was added in order to
assess Ss’ tendency to overgeneralize the principles
to inappropriate situations.

Transfer of heuristics was measured by two
tests. The first of these (Written Heuristic Trans-
fer) was designed to assess the effects of the two
treatments on Ss’ ability to discover a mathemati-
cal principle on a “paper and pencil” task com-
prised of a series of written problems, each of
which could be solved easily if S discovered a
“shortcut.” The second test (Oral Heuristic Trans-
fer) consisted of a sequence of problems presented
orally by the teacher, each of which could be
solved easily if S discovered the “shortcut.” In
this test, final criterion behavior was determined
by performance on a six-problem paper and pencil
exercise. The major difference between the tests
was in the mode of presentation (oral or written)
of the initial problem sequence. Both of these tests
were administered in the posttest series to Ss in
both treatments. Because of the nature of these
two instruments, it was impossible to obtain a
reliability coefficient by test-retest, split-half, or
any usual reliability technique.⁵ The tests were,
however, administered to a convenience sample of
54 fifth- and sixth-grade pupils. These pupils were
matched as nearly as possible on IQ and arithmetic
achievement scores, resulting in 27 pairs of pupils.
The scores of the matched pupils were then corre-
lated for each test, with resulting correlations of
.89 for the Written Heuristic Transfer test and
*For a discussion of problems associated 


assessing reliability of instruments of this Rar) 
Teader is referred to Thorndike (1951, pp- 6 
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‘ for the Oral Heuristic Transfer test. These coeffi- 


cients were used as crude estimates of the reliabil- 
‘ity of these instruments. 
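
The matched-pairs estimate described above amounts to a Pearson correlation between the scores of the two members of each pair. The following is a minimal sketch of that computation, not the study's actual procedure; the five pairs of scores are hypothetical placeholders (the study used 27 pairs), and the function name is an assumption introduced for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical test scores for five matched pairs of pupils.
first_members = [3, 5, 2, 6, 4]
second_members = [4, 5, 1, 6, 3]

r = pearson_r(first_members, second_members)
print(f"crude reliability estimate r = {r:.2f}")
```

Because the two scores in a pair come from different pupils, a coefficient obtained this way is only as good as the matching, which is presumably why the text calls it a crude estimate.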

Pupil attitude toward arithmetic was assessed
by two attitude scales administered in the pretest
series and again in the posttest series to Ss in
both treatments. The first of these (Semantic Dif-
ferential Attitude Scale) was constructed along
the principles of semantic differentiation developed
by Osgood, Suci, and Tannenbaum (1958). All of
the scales used in scoring this test were scales
which received repeated high loadings on the eval-
uative factor in several of Osgood’s studies (Os-
good et al., 1958, ch. 2). The second scale (State-
ment Attitude Scale) was adapted from a similar
instrument used by Umemoto and Haslerud (1955)
and elicited expressions of pupil agreement or dis-
agreement with both favorable and unfavorable
statements concerning arithmetic. Since the two
scales were judged to measure slightly different
dimensions of attitude,* each S’s scores on the two
scales were summed into a total attitude score
(Total Attitude Scale).

In addition to these criterion measures, two
instruments were used to assess the degree to which
teachers adhered to the prescribed teaching model
in each treatment. The first of these (Pupil Per-
ception of Teaching Behavior) was a forced-choice
questionnaire which elicited pupil responses to
statements about teaching-behavior characteristics
of their teacher. The statements reflected the five
characteristics of teaching behavior
which were used in defining the expository and
discovery teaching models described earlier. Each
statement could be classified on the basis of which
of the five “teaching-behavior characteristics” it
represented. In addition, the statements could be
divided into items which, if answered affirma-
tively, would typify pupil perception of “low dis-
covery” teaching behavior, and items which, if
answered affirmatively, would typify pupil percep-
tion of “high discovery” teaching behavior.

The scoring system, adapted from a similar
pupil questionnaire reported by Shaw (1963, p. 3),
was scaled so that a teacher-index score of 100 re-
flected pupil perception of maximum “high dis-
covery-low expository” teaching behavior, while a
teacher-index score of 0 reflected pupil perception
of maximum “low discovery-high expository”
teaching behavior.

In order to assess pupil perception of teaching
behavior on the discovery-expository dimension,
both before and after the experimental period, this
instrument was administered in both the pre- and
posttest series. The pretest scores of necessity re-
flected pupils’ perceptions of the teacher’s typical
behavior in teaching arithmetic prior to the ex-
perimental period. It was predicted that typical
teaching behavior, as measured by the index scores,
would be found to combine some elements of both
discovery and expository teaching but would re-
semble most closely the expository teaching model.

* The intercorrelation (Pearson r) between the two
attitude scales was .54 on a consecutive day test-retest.

The posttest index scores obtained by use of 
this instrument should have reflected, in some 
measure at least, pupils’ perceptions of the teach- 
er’s typical behavior in teaching arithmetic during 
the experimental period. Inasmuch as the rating 
scale was constructed so that the “pure discovery” 
model would be ranked as 100 on the scale and 
the “pure expository” model would be ranked as 
0, it was predicted that the pre- to posttest gains 
in teacher index scores would show the following 
trends: (a) a positive pre- to posttest gain in 
Treatment D, and (b) a negative pre- to posttest 
gain in Treatment E.

The second instrument used to assess teacher 
fidelity to the prescribed models of teaching was a 
rating scale (Observer Rating of Teaching Be- 
havior) used to rate teaching behavior through 
classroom observation and rating from audio-tape 
recordings. All raters were thoroughly trained in 
the selected mathematical concepts and in both 
treatment models during the training program de- 
scribed earlier. The percentage of rater agreement 
on this instrument was defined as the percentage 
of the total pairs of ratings assigned by two or 
more raters (rating the same teacher on the same 
lesson) which were in perfect agreement. Multiple 
rating of a lesson consisted of three modes: (a)
instances when two observer-raters simultaneously 
visited the same classroom; (b) instances when an 
observer-rater rated a lesson which was also audio- 
taped, transcribed, and rated by a tape-rater; and 
(c) instances when a tape transcription was rated 
by all raters. The overall rater agreement obtained
from these methods was .76. More definitive in- 
formation on percentage of rater agreement for 
each mode is contained in the original report of 
the study (Worthen, 1965). 
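
The agreement index defined above can be sketched as a count of matching rating pairs. The sketch below assumes that agreement is tallied item by item for every pair of raters who rated the same lesson; the source does not spell out that level of detail, and the ratings shown are hypothetical.

```python
from itertools import combinations

def rater_agreement(lessons):
    """Proportion of rating pairs in perfect agreement.

    `lessons` is a list of multiply rated lessons. Each lesson is a list with
    one entry per rater, and each entry is that rater's list of item ratings.
    """
    agreeing = 0
    total = 0
    for ratings_by_rater in lessons:
        # Every pair of raters who rated this lesson contributes one
        # comparison per item on the scale.
        for first, second in combinations(ratings_by_rater, 2):
            for a, b in zip(first, second):
                total += 1
                if a == b:
                    agreeing += 1
    return agreeing / total

# Hypothetical data: one lesson rated by two raters, one rated by three.
lessons = [
    [[5, 4, 5, 3], [5, 4, 4, 3]],
    [[1, 2, 1], [1, 2, 2], [1, 1, 1]],
]
print(round(rater_agreement(lessons), 3))
```

Requiring perfect agreement is a strict criterion: ratings that differ by a single scale point count as disagreements, so an index of this kind tends to understate how close the raters actually were.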

Ratings on the Observer Rating of Teaching 
Behavior scale were subdivided into two cate- 
gories: (a) ratings on those items used to differ- 
entiate between discovery and expository teaching, 
and (b) ratings on items used to assess general 
teaching behavior. Data yielded by items in the 
latter category are not discussed in this report, but 
are included in the previous reports of this re- 
search. The items in the former category reflected 
the five teaching behavior characteristics used to 
define the treatment models. Comparisons of the
mean ratings for each teacher in each treatment 
on these items were used to determine teacher 
fidelity to these prescribed models. Scoring of the 
rating scale was designed so that perfect adherence 
to the discovery model would have theoretically 
resulted in a mean teacher rating of 5.0, while per- 
fect adherence to the expository model would 
theoretically have resulted in a mean teacher rat-
ing of 1.0.

The Pintner Intermediate Test, Form A (IQ)
and the Metropolitan Achievement Test, Tests 5 
and 6 (arithmetic computation and arithmetic 
problem-solving) were used as measures of group 
comparability.


RESULTS


Analysis of Teaching Behavior

Many investigations in which each
teacher presented the learning task by two
or more methods have been subject to the
criticism that the teachers were unable to
vary their teaching behavior sufficiently
to effect a real test of the various treatment
models. In order to obviate such criticism
of the present study, two instruments de-
scribed earlier (Observer Rating of Teach-
ing Behavior and Pupil Perception of
Teaching Behavior) were used to gather
data on teaching behavior which might be
characterized as “discovery” or “exposi-
tory” in nature. These data were analyzed
by use of the standard analysis of variance.

Summary of Observer Rating of Teach-
ing Behavior. Table 2 shows the mean
rating by teacher and treatment for this
instrument.

TABLE 2
Mean Teacher Ratings on Observer Rating of Teaching Behavior Scale:
by Teacher and Treatment

Treatment |                        Teacher                        | Total     | Perfect
          |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   | treatment | theoretical mean
D         | 4.76 | 4.68 | 4.73 | 4.32 | 4.38 | 4.58 | 4.73 | 4.45 | 4.56      | 5.00
E         | 1.40 | 1.82 | 1.20 | 1.56 | 1.30 | 1.56 | 1.20 | 1.68 | 1.41      | 1.00

Four analyses of variance were carried
out on these data. The first analysis of
variance compared mean teacher ratings
in Treatment D with mean teacher ratings
in Treatment E and yielded a highly sig-
nificant difference between the two experi-
mental treatments. This analysis confirmed
the notion that the instruction given to
both groups was consistently dissimilar.

The second analysis of variance com-
pared the proximity of the obtained mean
teacher ratings in each treatment to the
perfect score for each of the theoretical
teaching models. No significant differences
were found in this comparison, thus sub-
stantiating the idea that instruction in
both treatments followed the prescribed
teaching models equally well.

The mean teacher ratings were also
used to compare teacher differences within
treatments. One analysis of variance com-
pared the mean ratings for each teacher in
Treatment D, while the second analysis
compared the differences among the mean
teacher ratings in Treatment E. No sig-
nificant differences between teachers were
found with either of these analyses.
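
The "proximity to the perfect score" comparison can be illustrated with the teacher means in Table 2. The sketch below only computes each teacher's absolute distance from the theoretical rating (5.00 for Treatment D, 1.00 for Treatment E); it is not the analysis of variance reported in the text, and the function names are illustrative assumptions.

```python
# Mean observer ratings for teachers 1-8, copied from Table 2.
ratings = {
    "D": [4.76, 4.68, 4.73, 4.32, 4.38, 4.58, 4.73, 4.45],
    "E": [1.40, 1.82, 1.20, 1.56, 1.30, 1.56, 1.20, 1.68],
}
# Theoretical rating for perfect adherence to each teaching model.
perfect = {"D": 5.00, "E": 1.00}

def distances_from_perfect(treatment):
    """Absolute distance of each teacher's mean rating from the perfect score."""
    return [round(abs(r - perfect[treatment]), 2) for r in ratings[treatment]]

def mean(values):
    return sum(values) / len(values)

for treatment in ("D", "E"):
    d = distances_from_perfect(treatment)
    print(treatment, d, "mean distance:", round(mean(d), 3))
```

The unweighted mean distances come out close to each other (roughly 0.42 and 0.47 scale points), which is in line with the reported finding of no significant difference in fidelity between treatments. These unweighted figures need not match the "Total treatment" column of Table 2, which may be weighted by the number of ratings per teacher.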

Summary of Pupil Perception of Teach-
ing Behavior. This instrument was used in
an attempt to assess pupil perception of
teaching behavior on the discovery-exposi-
tory dimension both before and after the
experimental period. The mean pretest in-
dex score for each teacher in each treatment
is reported in Table 3. Although the pre-
diction that typical teaching behavior
would resemble most closely the expository
teaching model was borne out (total treat-
ment pretest means of 42.9 and 44.5), the
pretest means approached rather closely
the theoretical mean of 50 which would
reflect discovery and expository teaching
in equal proportions.

This questionnaire was scaled so that
the pre- to posttest gain score for each
teacher in each treatment could be used
as an index of the teacher’s adherence to
the teaching model. In Treatment D, high
fidelity to that model of teaching should
have resulted in a positive pre- to posttest
gain score. In Treatment E, high fidelity
to that model of teaching should have re-
sulted in a negative gain score. The mean
pre- to posttest gains for each teacher in
Treatments D and E are also presented in
Table 3 and show a definite gain for each
experimental treatment group in the pre-
dicted direction.

An analysis of variance which compared
mean teacher gain scores in the two treat-
ments revealed a highly significant dif-
ference (F = 25.59, df = 1/398, p < .01).
These data were interpreted as further evi-
dence that the teachers varied their behav-
ior sufficiently to effect a real test of the
two teaching models. No significant dif-
ferences were found among teacher pre- to
posttest gain scores within either treat-
ment.

TABLE 3
Mean Pretest, Posttest, and Gain Scores, and Standard Deviations on Pupil Perception of
Teaching Behavior: by Teacher and Treatment

Treatment | Test | Statistic |                        Teacher                        | Total
          |      |           |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   | treatment
D         | Pre  | M         | 35.4 | 38.6 | 43.2 | 47.9 | 43.8 | 39.7 | 46.9 | 47.7 | 42.9
          |      | SD        | 19.3 | 17.8 | 18.6 | 22.9 | 21.7 | 18.4 | 19.0 | 23.8 | 20.8
          | Post | M         | 43.5 | 45.6 | 56.0 | 54.5 | 48.5 | 46.8 | 50.1 | 57.4 | 50.3
          |      | SD        | 22.1 | 21.0 | 17.5 | 21.7 | 24.0 | 17.4 | 17.1 | 22.9 | 21.2
          | Gain | M         |  8.1 |  7.0 | 12.8 |  6.6 |  4.7 |  7.1 |  3.2 |  9.6 |  7.4
          |      | SD        | 13.8 |  8.8 | 12.0 | 12.3 | 13.6 | 12.6 |  8.6 | 12.6 | 12.2
E         | Pre  | M         | 44.0 | 42.3 | 42.8 | 45.7 | 45.0 | 40.8 | 51.8 | 43.5 | 44.5
          |      | SD        | 26.2 | 19.7 | 20.4 | 18.7 | 23.6 | 16.8 | 16.6 | 19.9 | 20.5
          | Post | M         | 42.6 | 37.5 | 44.4 | 44.0 | 44.6 | 36.0 | 47.3 | 46.9 | 42.9
          |      | SD        | 21.3 | 17.2 | 16.1 | 19.9 | 21.1 | 17.2 | 16.8 | 22.0 | 19.6
          | Gain | M         | -1.4 | -4.8 |  1.6 | -1.7 |  -.4 | -4.8 | -4.5 |  3.5 | -1.6
          |      | SD        | 12.8 |  8.6 |  8.4 |  8.6 | 12.6 | 11.4 |  9.8 |  9.3 | 10.6
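
The predicted direction of the gain scores can be checked directly from the teacher means in Table 3. The sketch below recomputes each teacher's gain as posttest mean minus pretest mean and counts how many teachers moved in the predicted direction (positive in Treatment D, negative in Treatment E). Two cells that are garbled in the scanned table are read here as 39.7 and 37.5, which is an assumption based on the printed gains; gains recomputed from rounded means can also differ from the printed gains by 0.1.

```python
# Mean pretest and posttest index scores for teachers 1-8, from Table 3.
pre = {
    "D": [35.4, 38.6, 43.2, 47.9, 43.8, 39.7, 46.9, 47.7],
    "E": [44.0, 42.3, 42.8, 45.7, 45.0, 40.8, 51.8, 43.5],
}
post = {
    "D": [43.5, 45.6, 56.0, 54.5, 48.5, 46.8, 50.1, 57.4],
    "E": [42.6, 37.5, 44.4, 44.0, 44.6, 36.0, 47.3, 46.9],
}
# Fidelity to the model predicts a positive gain in D and a negative gain in E.
predicted_sign = {"D": 1, "E": -1}

def gains(treatment):
    """Pre- to posttest gain for each teacher, rounded to one decimal."""
    return [round(b - a, 1) for a, b in zip(pre[treatment], post[treatment])]

def teachers_in_predicted_direction(treatment):
    """Number of teachers whose gain has the predicted sign."""
    sign = predicted_sign[treatment]
    return sum(1 for g in gains(treatment) if g * sign > 0)

for treatment in ("D", "E"):
    print(treatment, gains(treatment), teachers_in_predicted_direction(treatment))
```

On these figures all eight Treatment D teachers gained in the predicted direction, while six of the eight Treatment E teachers did; the Treatment E shifts are also much smaller, which fits the observation that pretest teaching already resembled the expository model.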


Summary of Tests of Hypotheses 


Because of the noncomparability of the
treatment groups on some pretreatment
measures, statistical controls were imposed
in all intertreatment data analyses (except
analyses of teaching-behavior data dis-
cussed previously) by use of a two-way
teacher-by-treatment analysis of covar-
iance.

The choice of covariates was determined
by an examination of the intercorrelations
on all measures and variables. On this
basis, IQ, arithmetic computation, and
arithmetic problem-solving were used as
constant covariates in the analysis of each
dependent variable. Pretest scores were
used as additional covariates in analysis
of the posttest of each instrument admin-
istered in both the pre- and posttest series.
Posttest scores on the Concept Knowledge
Test were used as an additional covariate
in the analysis of the Concept Retention
and Concept Transfer tests.
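
The covariate rules just described can be written out as a small lookup. This is a sketch of one reading of the text, not the study's documented specification: the sets naming which instruments had a pretest, and which analyses used the Concept Knowledge posttest, are assumptions drawn from the description of the instruments.

```python
# Covariates applied in the analysis of every dependent variable.
CONSTANT_COVARIATES = [
    "IQ",
    "arithmetic computation",
    "arithmetic problem-solving",
]

# Assumption: instruments described as given in both pre- and posttest series.
HAS_PRETEST = {
    "Concept Knowledge Test",
    "Semantic Differential Attitude Scale",
    "Statement Attitude Scale",
    "Total Attitude Scale",
}

# Assumption: analyses that also used the Concept Knowledge posttest score.
USES_KNOWLEDGE_POSTTEST = {
    "Concept Retention Test 1",
    "Concept Retention Test 2",
    "Concept Transfer Test",
}

def covariates_for(measure):
    """Return the list of covariates used when analyzing `measure`."""
    covariates = list(CONSTANT_COVARIATES)
    if measure in HAS_PRETEST:
        covariates.append(measure + " pretest")
    if measure in USES_KNOWLEDGE_POSTTEST:
        covariates.append("Concept Knowledge Test posttest")
    return covariates

for name in ("Concept Retention Test 1", "Oral Heuristic Transfer"):
    print(name, "->", covariates_for(name))
```

Under this reading, measures with four covariates would lose one more error degree of freedom than measures with three, which appears consistent with the error df of 412 and 413 reported in Table 5.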

Significant F ratios for between-teacher 
effects and Teacher × Treatment interac-
tion were yielded by the analysis of each
criterion measure. No completely satis- 
factory explanation could be given for 
these results, although three plausible ex- 
planations are offered below. 

1. Variables of teaching behavior and 
personality too fine to be detected by the 
gross measures of teaching behavior used 
in this study were operative for all teachers. 

2. Certain teacher personalities were 
more compatible with one of the experi- 
mental treatments than with the other, 
thus resulting in Teacher × Treatment in-
teraction effects.

3. The variation between schools in the 
number of units completed could also be a 
highly plausible explanation of the signif- 
icant between-teacher F ratio yielded by 
all measures used in this study. 

No further attempt to explain these find-
ings is given here. Only the results yielded
by comparisons between Treatments D and
E are presented. Posttest means and stand-
ard deviations for each criterion measure
are shown in Table 4 along with the post-
test means adjusted by covariance and
standard errors of the adjusted means.

TABLE 4
Posttest Means and Standard Deviations;
Posttest Means Adjusted by Covariance
and Standard Errors of Adjusted
Posttest Means

[Most cell values of this table are illegible in the source. Legible
row labels, each reported for Treatments D and E: Concept Knowledge
Test; Concept Retention Test 1; Concept Retention Test 2; Concept
Transfer Test; Negative Concept Transfer; Semantic Differential
Attitude Scale; Statement Attitude Scale; Total Attitude Scale;
Written Heuristic Transfer. Legible entries (column not identifiable):
Concept Retention Test 1, D, 48.9; Concept Retention Test 2, D, 48.9 (a),
E, 46.1.]

a The computer program used in the covariance of
Concept Retention 2 and both transfer tests did
not give directly adjusted means.

Initial learning. The data yielded by 
the Concept Knowledge Test did not sup- 
port the hypothesis that Treatment D 
would produce superior results on an initial 
learning test. On the contrary, these data 
showed Treatment E to produce signifi- 
cantly better results (p < .01) than Treat- 
ment D on the initial learning criterion 
test. 

Retention. The hypothesis that Treat- 
ment D would produce superior results 
to Treatment E on a retention test given 
5 weeks and 11 weeks after instruction 
was supported by the evidence yielded by 
an analysis of the Concept Retention Test 
scores (p < .05 for the first administration 
and p < .025 for the second) .7 

7 The Concept Knowledge Test represents the
summation of four discrete subtests, each of which
was administered immediately upon completion of
the corresponding subsection of instructional ma-
terials. This resulted in a series of four
posttests given approximately 8, 6, 4, and 3 weeks
prior to the first administration of the Concept Re-
tention Test. The four subscores were summed to
yield a total Concept Knowledge Test score. The
average delay between administration of these sub-
tests and the first administration of the Concept
Retention Test was slightly over 5 weeks. The sec-
ond administration of the Concept Retention Test
came 6 weeks after the first. Therefore, the average
time between the subtests and the second retention
test was slightly over 11 weeks. The reader should
note that the second administration of the Concept
Retention Test was not reported in previous re-
ports of this study (Della-Piana, Eldredge, &
Worthen, 1965; Worthen, 1965). Both reports were
written to meet deadlines which came before the
second administration of the Concept Retention
Test. Data from this second retention test were
analyzed after the earlier reports had been published.

Concept transfer. The data yielded by
the Concept Transfer Test lent slight sup-
port (p < .08) to the hypothesis that
pupils in Treatment D would show greater
ability to transfer the concepts learned
during instruction than would pupils in
Treatment E.

Negative concept transfer. There was
no support in the data yielded by the
Negative Concept Transfer Test for the
hypothesis that Treatment D would pro-
duce less negative transfer than Treatment
E. Rather, it was found that there were no
differences in negative transfer between
the treatments.

Attitude. Of the three possible compar-
isons between Treatments D and E on
measures of attitude, none reached sig-
nificance at a minimum acceptable level.
The hypothesis that Treatment D would
produce superior results to Treatment E
on attitude measures was not supported
by the data.

Transfer of heuristics. The hypothesis
that Treatment D would produce superior
results to Treatment E on tests of pupil
ability to transfer heuristics was supported
by the evidence yielded by analyses of
scores from both the Written Heuristic
Transfer Test (p < .05) and the Oral
Heuristic Transfer Test (p < .025).

Table 5 summarizes the analyses of
covariance which yielded the above results.


DISCUSSION AND CONCLUSIONS


Teaching Behavior


Of most importance for the interpreta-
tion of the results of this study was the
clear-cut evidence that Ss in the two
treatments received instruction by two
consistently different methods of teaching,
each of which closely paralleled the par-
ticular model prescribed. Although the
necessity of experimental controls may
have precluded either method from reach-
ing its optimum power, this factor, if
present, was equally operative in both
treatments.


Test of Hypotheses 


In general, the findings of this study
seem to support many of the claims made
by proponents of discovery methods. The
most dramatic finding was the rather
startling reversal in rank of Treatments
D and E between the administration of the
Concept Knowledge posttest and the first
administration of the Concept Retention
Test 5 weeks later. Although Treatment E
was significantly superior to Treatment D
on the tests of initial learning (p < .01),
the retention test given after an average
5-week delay showed Treatment E not
only to have lost this initial superiority
but also to have been surpassed by Treat-
ment D. The Ss taught by the discovery
method were able to retain significantly
more material (p < .05) over the interven-
ing period, notwithstanding the fact that
they had evidenced knowledge of signifi-
cantly less material than the Treatment E
group on the test of initial learning. The
second administration of the Concept
Retention Test 6 weeks after its first ad-
ministration showed Treatment D to have
maintained this superiority over Treatment
E (p < .025). This finding strongly suggests
that presentation of mathematical con-
cepts to sixth-grade pupils through dis-
covery sequencing causes the learner to
integrate the content conceptually in such
a manner that he can retain it more readily
than if the concepts had been presented
to him in an expository sequence.
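
The "average 5-week" and 11-week delays cited in this discussion follow from the subtest schedule given in the footnote on the Concept Knowledge Test (subtests given approximately 8, 6, 4, and 3 weeks before the first retention test, with the second retention test 6 weeks after the first). A short worked check, with variable names chosen only for illustration:

```python
# Weeks between each Concept Knowledge subtest and the first retention test.
weeks_before_first_retention = [8, 6, 4, 3]

# Average delay from initial learning to the first retention test.
average_first_delay = sum(weeks_before_first_retention) / len(weeks_before_first_retention)

# The second retention test came 6 weeks after the first.
weeks_between_retention_tests = 6
average_second_delay = average_first_delay + weeks_between_retention_tests

print(average_first_delay)   # slightly over 5 weeks
print(average_second_delay)  # slightly over 11 weeks
```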

Another finding which clearly favors
Treatment D is that dealing with Ss’ ac-
quisition of a problem-solving set. In
light of the evidence yielded by both the
Written Heuristic Transfer and the Oral
Heuristic Transfer tests, it seems reasonable
to conclude that learning by discovery
techniques significantly increases pupil
ability to use discovery problem-solving
approaches in new situations, both those
which require paper and pencil application
and those which involve verbal presenta-
tion by the teacher. Treatment D was
shown to be significantly superior to Treat-
ment E on both of these dimensions in the
present study.

TABLE 5
Summary of Analyses of Covariance of
Criterion Measure Posttest Scores
Between Treatments D and E

Measure                              | df1 | df2 | F         | Result
Concept Knowledge Test               | 1   | 412 | 7.435**** | D < E
Concept Retention Test 1             | 1   | 412 | 3.918**   | D > E
Concept Retention Test 2             | 1   | 412 | 5.868***  | D > E
Concept Transfer Test                | 1   | 412 | 3.089*    | D > E
Negative Concept Transfer Test       | 1   | 413 | .098      |
Semantic Differential Attitude Scale | 1   | 412 | .161      |
Statement Attitude Scale             | 1   | 412 | 1.173     |
Total Attitude Scale                 | 1   | 412 | 2.057     |
Written Heuristic Transfer           | 1   | 413 | 5.004**   | D > E
Oral Heuristic Transfer              | 1   | 413 | 5.720***  | D > E

* p < .08.
** p < .05.
*** p < .025.
**** p < .01.
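
The significance levels attached to the F ratios in Table 5 can be roughly checked by hand. With 1 numerator degree of freedom, F is the square of a t statistic, and with more than 400 error degrees of freedom t is close to a standard normal deviate. The sketch below uses that normal approximation, which is an assumption of this illustration rather than the method used in the study, so the probabilities are approximate.

```python
from math import erfc, sqrt

def approx_p_from_f(f_ratio):
    """Approximate p for an F ratio with 1 numerator df and a large error df.

    Treats sqrt(F) as a standard normal z and returns the two-tailed
    probability. This is an approximation, not the exact F-distribution tail.
    """
    z = sqrt(f_ratio)
    return erfc(z / sqrt(2))

# F ratios and reported significance levels from Table 5
# (None marks a comparison reported as not significant).
table5 = {
    "Concept Knowledge Test": (7.435, 0.01),
    "Concept Retention Test 1": (3.918, 0.05),
    "Concept Retention Test 2": (5.868, 0.025),
    "Concept Transfer Test": (3.089, 0.08),
    "Negative Concept Transfer Test": (0.098, None),
    "Semantic Differential Attitude Scale": (0.161, None),
    "Statement Attitude Scale": (1.173, None),
    "Total Attitude Scale": (2.057, None),
    "Written Heuristic Transfer": (5.004, 0.05),
    "Oral Heuristic Transfer": (5.720, 0.025),
}

for measure, (f_ratio, level) in table5.items():
    print(f"{measure}: F = {f_ratio}, approx p = {approx_p_from_f(f_ratio):.3f}, reported {level}")
```

Each approximate probability falls below the level reported for that measure, and the four comparisons reported as nonsignificant all exceed .08. The Concept Transfer Test sits only just under the .08 threshold, which matches the description of that result as lending slight support.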

Treatment D also seems superior to 
Treatment E in terms of transfer, although 
this finding is somewhat tenuous. It was 
the experimenter’s opinion that the Con- 
cept Transfer Test was much too difficult 
for Ss involved and that this factor re- 
duced the possibility of finding more sig- 
nificant differences between the treatments. 
Random errors of measurement, due to 
difficulty and resultant guessing of an- 
swers, could act to increase the analysis of 
covariance error term, thus making it more 
difficult to obtain a significant F ratio. 
The obtained F ratio favored Treatment 
D over Treatment E at a marginal level 
of significance (p < .08), and the ex- 
perimenter would speculate that modifica- 
tions of the instrument to reduce the ran- 
dom error of measurement would result in 
more highly significant differences in favor 
of Treatment D.

The question of relative practicality of
discovery and expository teaching in terms
of time consumption has been raised by
Ausubel (1961, 1964). It should be noted
that the controls established in this study
can be used to answer this question with
reference to fifth- and sixth-grade pupils.
Each teacher equalized the length and
number of daily learning periods for each
of his two classes. This resulted in some
variance among schools (due to class
scheduling), but no variance between
treatments when summed across schools.
The results indicate that the discovery
method need not be more time consuming
than the expository method of instruction
at this age level. When given an equal
amount of time to work on the learning
task, Ss in Treatment D proved superior to
Ss taught by Treatment E in the majority
of intertreatment comparisons.

Implications for future research. It was
noted earlier that programmatic research
dealing with various discovery-expository 
variables of task presentation should be 
initiated. In addition to a continuation of 
research in which sequence characteristics 
of the learning task are manipulated, the 
present research design and instructional 
materials might be modified to explore the 
relative effects of various types and 
amounts of guidance along the discovery- 
expository dimension. Studies could be 
designed in which the present learning task 
could be modified to compare discovery 
methods in which the verbal factor was 
varied from verbal to nonverbal discovery. 
Interrelationships among these relevant 
variables might then be explored. 

Implications for educational practice. 
Any generalizations based on findings of 
this study must take into account the 
particular teachers, experimental sample, 
instructional procedures, instructional ma-
terials, and criterion measures used. In
addition, without the programmatic re-
search suggested above, any conclusions
drawn on the basis of this single study 
must be tentative at best. 

Conversely, this study was conducted 
under carefully controlled conditions which 
were judged to approximate normal class- 
room conditions with respect to all dimen- 
sions except those specifically varied for 
experimental purposes. Because of the rela- 
tively large time sample, the nature of the 
learning task, and the large number of Ss 
used, it would seem that the results can be 
generalized, at least to innovative teaching
with similar Ss and subject matter con-
tent, with a relatively high degree of con-
fidence. Within this context, it is the ex-
perimenter’s opinion that, pending further
programmatic research, this study holds 
the following implications for educational 
practice: 

1. To the extent that pupil ability to 
retain mathematical concepts and pupil
ability to transfer heuristics of problem 
solving are valued outcomes of education, 
discovery sequencing should be an integral 
part of the methodology used in presenting 
mathematics in the elementary classroom. 

2. To the extent that immediate recall is 
a valued outcome of education, expository 
sequencing should be continued as the 
typical instructional practice used in the 
elementary classroom.



JAMES HENKELMAN, GERALD P. O'SHAUGHNESSY,
AND MILDRED B. COLE*
University of Maryland


This study investigated the effects of appropriate and inappropriate
practice experiences on students’ tendency to search for and find
shortcut solutions to problems. The strategy of looking for, and skill
in finding, a shortcut were termed a search set. In the 1st of 3
experiments, with criterion problems highly similar to the practice
problems, Ss having search experiences were more inclined to search
for and find shortcut solutions than those having nonsearch experi-
ences. In the 2nd and 3rd experiments, with the criterion problems
more dissimilar from the practice problems, Ss having search ex-
periences were more inclined to search for a shortcut than those
having nonsearch experiences, but were notably unsuccessful in
finding the shortcut solution. Thus, the strategy of search could be
made to transfer more readily than the skill required to search success-
fully.


The purpose of this study was to explore 
the inductive approach to set learning in 
terms of the conditions under which it 
takes place and the conditions under which 
it transfers. In particular the study dealt 
with the search set which included both 
the strategy or inclination to search for an 
“elegant” or shortcut solution to an edu-
cationally relevant problem and the skill
or knowledge to carry out the search suc-
cessfully and discover the appropriate so-
lution.

Learning-set formation as identified by
Harlow (1949, 1959) refers to the phe-
nomenon of transfer of training between
many problems of a single class. The orga-
nisms, in Harlow’s case rhesus monkeys,
were trained so that they could solve prob-
lems that were common in “principle” but
different in specific stimulus content from
the problems used during training. Such
learning required that certain error factors
such as stimulus preference or position
preference be overcome during training;
the monkeys had to learn to discriminate
between cues which led to an incorrect so-
lution and a class of cues that led to a cor-
rect solution.

Gagné and Paradise (1961) adopted 
the term “learning set” to refer to specific 
“knowledges,” that is, principles or con- 
cepts, subordinate to some final learning 
task. In the Gagné and Paradise sense, the 
term learning set referred to learning that 
had taken place and could then be applied 
to further, higher-order, learning, and did 
not refer to the “being set” or “being ori-
ented” toward a particular problem-solv-
ing approach as implied in the Harlow
sense of the term. 

Another track pursued in the study of 
problem solving and concept learning has 
been the strategy approach as used by 
Goodnow (1955), Goodnow and Pettigrew 
(1955), and Bruner, Goodnow, and Austin 
(1956). Goodnow and Pettigrew (1955) de- 
fined a strategy as a consistent way of 
deciding or responding. These authors re-
jected the notion of describing their par-
ticular problem-solving paradigm using
the concept of strategy by itself. They
chose to employ the term learning set to
apply to the knowledge or learning that
had taken place (in the Gagné & Para- 
dise, 1961, sense), and the term strategy 
was restricted to consistent use of, or 
sensitization to, search in a particular 
fashion. 

The term search set is offered to include 
both learning set and strategy concepts in 
a problem-solving situation. Individuals 
can be set to approach a problem in a 
particular way but will only be successful 
in this approach if they have acquired the 
appropriate knowledge. Having a search 
set means being set to respond to a mean- 
ingful stimulus configuration in a pre- 
scribed manner and having sufficient 
knowledge to successfully follow this pre- 
scription. It is not inconceivable that the 
skill and strategy components may not be 
present in equal amounts. 


PROBLEM 


The purpose of the present research was 
to attempt to induce a search set in in- 
dividuals through experimentally con- 
trolled prior experience and to determine 
the conditions which would allow the 
search set to transfer. Ideally, a set has 
unlimited transfer possibilities (Gagné, 
Mayor, Garstens, & Paradise, 1962), and 
this experiment tested the accuracy of such 
a statement. 

Goodnow and Pettigrew (1955) demon- 
strated that the learning set-strategy pack- 
age could be induced by controlled prior 
experiences. Individuals who were exposed
to a 100:0 pattern learned a later 100:0 
pattern or 0:100 pattern more quickly 
following a 50:50 exposure than subjects 
(Ss) who had never experienced the initial 
100:0 pattern. 

Luchins (1942), in a classic series of
studies, investigated the effect of Einstel-
lung, a reasonable synonym for set. He
found that prior exposures to problems
having a common solution route produced
extreme rigidity in the continued use of
that solution route even when other, more
direct, routes were available. The undesir-
ability of Einstellung is not clearly dem-
onstrated in Luchins’ work although that
appeared to be his intention. In some cases
the Einstellung solution was a correct one,
albeit a longer one to accomplish. How-
ever, if the alternative to the Einstellung
solution was no solution, then the Ein-
stellung solution takes on greater value
than that attributed to it by Luchins. In
fact, Luchins showed that increasing the
time pressure increased the likelihood of
an Einstellung solution, implying that Ss
valued a solution over no solution.

In this research, the emphasis is the in-
verse of Luchins’, that is, to induce the
direct or “elegant” solution route through
controlled prior experience in an effort to
overcome a more obvious or “inelegant”
approach which was available and had
been well-practiced by S. That is, rather
than attempting to induce rigidity of the
inelegant or unnecessarily long solution
route as Luchins did, the purpose here was
primarily to induce and teach a search set
that represented a break with a more
traditional solution route in the direction
of a more elegant solution. The Luchins
approach of inducing rigidity through in-
appropriate experience was included as a
control condition against which the effects
of appropriate experience could be evalu-
ated. In keeping with Luchins’ attempt to
alter behavior by instruction, two instruc-
tion conditions, one appropriate and one
inappropriate, were also included in this
experiment. Finally, an attempt was made
to see whether experiences appropriate for
a particular search set would allow for
that set to be transferred to situations in
which its appropriateness was not so ob-
vious.

The mathematician’s utilization of the
terms “elegant” and “inelegant” solution
to discriminate between alternative correct
approaches on the basis of
cleverness and speed are appropriate descriptions
for the search set and its antithesis
as used in this research. Befitting the use
of these terms, a mathematical problem was
the vehicle for studying search-set induction.

Finally, the relevance of the problem to educational settings lies in
the fact that the educator is anxious to foster learning by discovery
(i.e., induction) in the classroom since, as the findings of Gagné and
Brown (1961) and Katona (1940) show, such learning provides for greater
transfer of the concepts and principles to new but related problems.
Transfer and transferability are clearly goals of the educational
process.


Hypotheses


The following hypotheses were offered:

1. Individuals having search experiences will be more inclined to seek
and discover a search solution on a problem highly similar to that
experienced than those having nonsearch experiences.

2. Individuals given a search instruction will be more inclined to seek
and discover a search solution than those not so instructed.

3. Individuals having search experiences will be more inclined to seek
and discover a search solution on a problem only moderately similar in
principle to that experienced as compared to individuals having
nonsearch experiences.

It is anticipated that experiences in solving problems which have been
structured so that a particular class of search solutions works will
lead to learning by discovery of that class of solutions (this has been
demonstrated by Gagné & Brown, 1961) as well as disposing persons toward
the use of a search strategy (cf. Goodnow & Pettigrew, 1955). Such
learning and strategy adoption as a result of solving search problems
should show a range of transferability to be of practical significance
in an educational context. While the use of an instruction or suggestion
will not provide for set learning, it may be equally strategy-inducing
as structured experiences. (Studies such as that of Pepitone &
Reichling, 1955, have successfully induced a cohesiveness set via
instruction.) Induction by instruction should yield more successful
search than no induction or nonsearch induction by instruction.


Experiment I
Subjects


Participants in Experiment I were 262 female undergraduate students at
the University of Maryland. Participation occurred as part of regular
undergraduate mathematics classes.

The Task


The task was a 4 × 6 matrix of two-digit numbers as follows that were
to be added:


18 18 18 18 18 18 
10 10 10 10 10 10 
24 24 24 24 24 24 
79 79 79 79 79 79 


Those problems identified as search problems 
could be solved by methods other than adding all 
24 numbers. The example given above represents 
a search problem that could be solved (i.e., the 
sum could be found) by adding the numbers in 
the first column and multiplying by 6 since all 
the columns are identical. Other search problems 
featured two columns of numbers repeated three 
times each; others featured a row repeated four 
times although the order of the numbers was differ- 
ent from row to row. The common feature of 
search problems was that a shortcut method could 
be used to obtain a sum rather than adding all 24 
numbers. 

A nonsearch problem was a matrix of 24 numbers
that could only be summed by adding all 24 
numbers; no pattern existed so that no shortcut 
solution could be used. Search and nonsearch 
problems were equated in pairs in terms of the 
amount of time required to solve each by summing 
all 24 numbers. However, solving search problems 
via the shortcut required considerably less time 
once the shortcut was discovered. 


Procedure 


The Ss were randomly assigned to one of the eight conditions shown in
Figure 1. The eight conditions run are described below:

Cell 1. Ss were given a booklet containing four problems, each of which
could be solved by a shortcut. The first three constituted search
practice; the fourth was a search criterion problem.

Cell 2. Ss were given a booklet containing four problems, none of which
could be solved by a shortcut. The first three constituted nonsearch
practice; the fourth was a nonsearch criterion problem.

Cell 3. Ss were given a booklet containing four problems. The first
three could not be solved by a shortcut (same problems as Cell 2) and
constituted nonsearch practice. The fourth problem could be solved by a
shortcut and constituted a search criterion problem (same criterion
problem as Cell 1).

Cell 4. Ss were given a booklet containing four problems. The first
three could be solved by a shortcut (same problems as Cell 1) and
constituted search practice. The fourth problem could not be solved by a
shortcut and constituted a nonsearch criterion problem (same criterion
problem as Cell 2).

(Thus, Cells 1 and 2 represent prior experience appropriate for the
criterion problem while Cells 3 and 4 represent prior experience
inappropriate for the criterion problem.)


Cell  Condition                        Practice      Criterion   N    M            SD
1     Appropriate prior experience     3 search      Search      35   60           6.0
2     Appropriate prior experience     3 nonsearch   Nonsearch   34   94           4.9
3     Inappropriate prior experience   3 nonsearch   Search      35   85           6.0
4     Inappropriate prior experience   3 search      Nonsearch   33   115          5.5
5     Appropriate suggestion           None          Search      32   108          7.0
6     Inappropriate suggestion         None          Nonsearch   31   135          6.6
7     No intervention                  None          Search      32   [illegible]  9.5
8     No intervention                  None          Nonsearch   30   134          6.3

Fig. 1. Design of Experiment I and mean times to completion in
seconds (and standard deviations) for each cell.

Cell 5. Ss were given a single problem which 
could be solved by a shortcut constituting a search 
criterion problem. Appearing above the problem 
was the statement, “Problems such as this can 
often be solved using a shortcut.” This represents 
appropriate suggestion. 

Cell 6. Ss were given a single problem which 
could not be solved by a shortcut constituting a 
nonsearch criterion problem. At the top of the 
page appeared the same statement as in Cell 5. 
This represents inappropriate suggestion. 

Cell 7. Ss were given a single search criterion 
problem. 

Cell 8. Ss were given a single nonsearch cri- 
terion problem. 

The search criterion problem used in Cells 1, 3, 
5, and 7 and the nonsearch criterion problem used 
in Cells 2, 4, 6, and 8 appear below. 


Search criterion problem        Nonsearch criterion problem


13 28 49 49 64 13               61 59 68 89 32 99
28 49 64 13 49 64               27 72 68 21 67 63
49 13 13 64 28 28               78 95 49 58 33 16
64 64 28 28 13 49               71 9 9 40 34 38
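
As a check on the arithmetic, the following minimal Python sketch sums the search criterion problem by both routes. The function names and the rule used to detect the pattern (every column holds the same four numbers in some order) are assumptions made for illustration; they are not part of the original procedure.

```python
# Illustrative sketch only: the names and the detection rule are assumptions.
SEARCH_CRITERION = [
    [13, 28, 49, 49, 64, 13],
    [28, 49, 64, 13, 49, 64],
    [49, 13, 13, 64, 28, 28],
    [64, 64, 28, 28, 13, 49],
]


def full_sum(matrix):
    """Nonsearch route: add all 24 numbers one by one."""
    return sum(sum(row) for row in matrix)


def column_shortcut_sum(matrix):
    """Search route: if every column holds the same numbers (in any order),
    add one column and multiply by the number of columns.
    Returns None when the columns do not share that pattern."""
    columns = [sorted(column) for column in zip(*matrix)]
    if all(column == columns[0] for column in columns):
        return sum(columns[0]) * len(columns)
    return None


print(column_shortcut_sum(SEARCH_CRITERION), full_sum(SEARCH_CRITERION))
```

Both routes give the same total here, since each column sums to 154 and there are six columns; the shortcut simply needs far fewer additions.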


Identifying Search Behavior 


All Ss timed themselves on all problems and recorded their time to
completion in seconds on the page of the problem. All scratch work was
also done on this page. (No feedback was given in any instance.) Since
search problems could be solved in considerably less time than nonsearch
problems once the shortcut was discovered, the effects of prior
experience or suggestion were assessed in terms of the time to problem
completion (i.e., time taken to obtain a sum of the numbers).
Specifically, Hypothesis 1 was tested by comparing mean time to solution
on the criterion problem for those groups having appropriate prior
experience (Cells 1 and 2) to those groups having inappropriate prior
experience (Cells 3 and 4). It was reasoned that:

1. Cell 1 Ss, if affected by the experience, would be set to search for
a shortcut solution, have skill in finding it, and subsequently complete
the criterion problem in a minimum amount of time;

2. Cell 2 Ss, if affected by the experience, would be set not to search
for a shortcut solution and sum all the numbers in an intermediate
amount of time (the shortcut route takes less time than straight
summing);

3. Cell 3 Ss, if affected by the experiences, would be set not to search
for a shortcut solution and sum all the numbers in an intermediate
amount of time;

4. Cell 4 Ss, if affected by the experience, would be set to search for
a shortcut solution, and not finding one since none existed, would end
up solving the problem by summing, thus using more time than any of the
preceding three cells.

Other comparisons were also made to assess the effects of practice and
of appropriate and inappropriate suggestion.

RESULTS 


Means and standard deviations for each 
of the eight cells of the experiment appear
in Figure 1. The results of an analysis 
of variance of solution times for Ss in the 
first four cells of the experiment indicate
that Ss having appropriate prior experience 
(Cells 1 and 2) took significantly less time 
to complete the criterion problem than Ss 
having inappropriate prior experience 
(Cells 3 and 4) (F = 15.8, df = 1/133, p < 
.01). A significant main effect of criterion 
problem was also obtained with the search 
criterion problem requiring less time to completion
than the nonsearch criterion problem
(F = 30.6, df = 1/133, p < .001). 
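
The degrees of freedom reported for this analysis can be checked against the cell sizes. The sketch below is a minimal illustration, assuming the cell sizes for Cells 1 through 4 as they are read from Figure 1; the function name is hypothetical.

```python
# Cell sizes for Cells 1-4 as read from Figure 1 (treated here as an assumption).
CELL_SIZES = {"Cell 1": 35, "Cell 2": 34, "Cell 3": 35, "Cell 4": 33}


def two_by_two_df(cell_sizes):
    """Return (effect df, error df) for a 2 x 2 between-subjects design."""
    n_total = sum(cell_sizes.values())
    n_cells = len(cell_sizes)
    df_effect = 2 - 1             # each factor has two levels
    df_error = n_total - n_cells  # total N minus the number of cells
    return df_effect, df_error


print(two_by_two_df(CELL_SIZES))
```

With 137 Ss in four cells, the error term has 133 degrees of freedom, which agrees with the df = 1/133 given for both main effects.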

Cell means were further compared using 
the Duncan multiple-range test (Winer, 
1962) and all mean differences except that 
between Cells 2 and 3 were significant at 
the .01 level. 

The fact that Ss in Cell 1 took less time 
to completion than Ss in Cell 3 indicates 
that the former were both more set and 
more skilled in searching as a result of 
their prior search experience. The fact that 
Ss in Cell 2 took less time to completion 
than Ss in Cell 4 indicates that Ss in Cell 
4 spent some time searching inappropri- 
ately as a result of their prior search ex- 
periences. These findings confirm the first 
hypothesis. 

The data also show that the effects of 
suggestion, either appropriate or inappropriate,
were effectively nil. Comparing Cell 
5 (appropriate suggestion) with Cell 7 (no 
intervention) using a t-test, time to completion
on the search criterion problem was 
no different for the two groups, indicating 
that the suggestion did not significantly 
affect the search set. Comparing Cell 6 
(inappropriate suggestion) to Cell 8 (no 
intervention) showed that on a nonsearch 
criterion problem the difference in time to 
completion between the two groups was 
not significant. Thus, inappropriate suggestion
did not serve to affect Ss adversely.
This finding negated any necessity 
for completing the two missing cells in the 
design which would have helped to assess 
the effects of suggestion, had there been 
any. 

Two interesting and unexpected findings 
emerged. The comparisons of Cells 1, 2, 3, 
and 4 (all those that had any prior ex- 
perience, whether appropriate or inap- 
propriate) versus Cells 5, 6, 7, and 8 (those 
not having any prior experience) show the 
former to have taken significantly less 
time to complete the criterion problem 
(t = 6.03, df = 264, p < .001). This find- 
ing suggests that there is a general famil- 
iarity or practice effect that enhances the 
performance of Ss on this type of problem,
independent of the set which is es- 
tablished by prior experience. 

A second finding is also worthy of note. 
If the effects of inappropriate prior ex- 
perience had been complete, the time to 
completion taken by Ss in Cell 3 should 
have been identical to that taken for Ss in 
Cell 2. There should not have been any 
differential reaction to the criterion prob- 
lem itself since both cells had prior experi- 
ence with nonsearch problems. However, 
as the data showed, Ss in Cell 3 took less 
time to complete the problem than Ss in 
Cell 2. While this difference did not ap- 
proach significance it is, however, worthy of 
note. The significant main effect of criterion 
problem leads one to conclude that the 
shortcut in the search criterion problem is 
sufficiently visible to be found by some Ss 
having nonsearch experience and would account
for the Cell 3 effect. Furthermore, 
comparison of Cells 7 and 8, neither of 
which had any experience, indicated that 
the search criterion problem as a stimulus 
array prompted search since Cell 7 Ss took 
significantly less time to solution than Cell 
8 Ss (t = 1.80, df = 60, p < .05). 

TABLE 1
Number of Subjects in Cells 1 and 3 Who Attained Search, Emergent
Search, and Nonsearch Solutions to the Search Criterion Problem
(Experiment I)

Solution     Cell 1   Cell 3   Total
Search       24       9        33
Emergent     7        15       22
Nonsearch    4        11       15
Total        35       35       70


To determine with more exactness the kinds of behavior going on among
Ss in Cell 3, an attempt was made to analyze the protocols of Ss in this
cell and compare them to the protocols of Ss in Cell 1. Specifically, an
attempt was made to determine which Ss actually carried out search or
shortcut solutions as opposed to those who did not carry out such
solutions. A third group was also identified in which Ss appeared not to
begin with search solutions but to end up with search solutions, such
solutions having emerged from their work on the search criterion
problem. This analysis of protocols could not be done on Cells 2 and 4
since these two cells used a nonsearch criterion problem for which no
search solution was possible. The purpose of the comparison between the
cells was to determine the extent to which inappropriate prior
experiences for Cell 3 Ss had a

complete versus a partial effect in terms of causing them to seek
nonsearch solutions to the search criterion problem. This breakdown by
type of solution utilized appears in Table 1. As the table shows, Ss in
Cell 1 used significantly more search solutions than Ss in Cell 3
(χ² = 10.56, df = [illegible], p < .01), thus supporting the time data.
However, it can be seen that the effects of inappropriate prior
experience on search criterion problem solution were incomplete since
some of the Cell 3 Ss immediately used the search solution while others
discovered a search solution in an emergent fashion. From this one must
conclude again that the search criterion problem itself, as a stimulus
configuration, influenced the behavior of Ss above and beyond the
effects of prior experience provided in the experiment.


Experiment II


Procedure


Experiment II was identical in procedure to the first four cells of
Experiment I except that different criterion problems were used in the
second experiment. Cells 5, 6, 7, and 8 of the first experiment were
dropped due to the limited results and notable practice effect obtained
in the first experiment.


Cell  Prior experience   Practice      Criterion         N    M     SD
1a    Appropriate        3 search      Search transfer   50   107   6.9
2a    Appropriate        3 nonsearch   Nonsearch         51   99    5.4
3a    Inappropriate      3 nonsearch   Search transfer   52   97    5.2
4a    Inappropriate      3 search      Nonsearch         47   122   4.9

Fig. 2. Design of Experiment II and mean times to completion in
seconds (and standard deviations) for each cell.


The criterion problems were altered to test the third hypothesis,
dealing with the transferability of search sets. Specifically, the new
search criterion problem required the use of a shortcut rule which had
not been encountered in the three prior experience problems and thus
required transfer. A new nonsearch criterion problem was developed and
equated in time to solution by adding with the search criterion problem.
Both criterion problems appear below.

Search criterion problem        Nonsearch criterion problem

13 28 49 49 64 13               61 59 68 89 32 99
28 49 64 13 49 64               27 72 68 21 67 63
13 28 64 28 49 13               78 95 49 58 33 16
49 64 28 64 13 28               71 9 9 40 34 38


The search criterion problem could be solved 
by using the following rule: Multiply each of the 
four numbers that appears (13, 28, 49, and 64) by 
6, which represents the number of times each ap- 
pears, and then sum the four products. 
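
The rule just stated can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the matrix entries are read as the four repeated values named in the rule (13, 28, 49, and 64), which is an assumption about the printed problem.

```python
from collections import Counter

# Search transfer criterion problem, with every entry read as one of the
# four repeated values named in the rule (an assumption of this sketch).
TRANSFER_CRITERION = [
    [13, 28, 49, 49, 64, 13],
    [28, 49, 64, 13, 49, 64],
    [13, 28, 64, 28, 49, 13],
    [49, 64, 28, 64, 13, 28],
]


def frequency_shortcut_sum(matrix):
    """Count how often each distinct number appears, multiply each number
    by its count, and add the products."""
    counts = Counter(value for row in matrix for value in row)
    total = sum(value * count for value, count in counts.items())
    return total, counts


total, counts = frequency_shortcut_sum(TRANSFER_CRITERION)
print(total, dict(counts))
```

Under this reading each of the four numbers appears six times, so the four products sum to the same total as adding all 24 numbers. The columns are no longer rearrangements of one another, which is why the column rule practiced earlier does not apply directly.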

Participants in the second experiment were 200 
undergraduate girls using the same procedure as 
was employed in the first experiment. These Ss 
came from the same total population as those in 
Experiment I. As in Experiment I, Ss were ran- 
domly assigned to conditions. 

The four conditions employed in Experiment II
appear in Figure 2. Briefly, these are: (a) prior 
search experience with the search transfer criterion
problem; (b) prior nonsearch experience with the
nonsearch criterion problem; (c) prior nonsearch
experience with the search transfer criterion problem;
and (d) prior search experience with the nonsearch
criterion problem. 

Again, time to solution and analysis of solutions 
were used as the criterion measures. 


Results


Means and standard deviations for each of the four cells of
Experiment II appear in Figure 2. The results of an analysis of variance
of solution times show that the effects of neither appropriate versus
inappropriate experience nor search versus nonsearch criterion problem
were significant but that the interaction of these two variables was
significant (F = 12.2, df = 1/196, p < .01). Inappropriate practice
appeared to facilitate performance on the search criterion problem while
it adversely affected performance on the nonsearch criterion problem. It
would seem that practicing the nonsearch approach was more effective a
prerequisite to completing the search transfer criterion problem than
was practicing the search approach. This finding is opposite to that in
the first experiment and leads to the rejection of the second
hypothesis.


TABLE 2
Number of Subjects in Cells 1a and 3a Who Attained Search and Nonsearch
Solutions to the Search Criterion Problem and Mean Time to Solution for
Each Subgroup (Experiment II)

Solution     Cell 1a subjects
Search       14
Nonsearch    36
Total        50
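
The crossover behind this interaction can be made concrete from the four cell means in Figure 2. The sketch below is a descriptive illustration only, using unweighted differences between cell means; it is not the analysis of variance itself, and the function and key names are hypothetical.

```python
# Mean times to completion (seconds) for Experiment II, from Figure 2.
CELL_MEANS = {
    ("appropriate", "search"): 107,        # Cell 1a
    ("appropriate", "nonsearch"): 99,      # Cell 2a
    ("inappropriate", "search"): 97,       # Cell 3a
    ("inappropriate", "nonsearch"): 122,   # Cell 4a
}


def interaction_summary(means):
    """Difference (appropriate minus inappropriate experience) within each
    criterion problem, and the difference between those two differences."""
    on_search = means[("appropriate", "search")] - means[("inappropriate", "search")]
    on_nonsearch = means[("appropriate", "nonsearch")] - means[("inappropriate", "nonsearch")]
    return {
        "experience_effect_search": on_search,
        "experience_effect_nonsearch": on_nonsearch,
        "interaction_contrast": on_search - on_nonsearch,
    }


print(interaction_summary(CELL_MEANS))
```

Appropriate experience costs about 10 seconds on the search transfer problem but saves about 23 seconds on the nonsearch problem, so the two simple effects point in opposite directions.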

In order to shed more light on this turn of events, protocols were
examined and frequencies of search solutions and nonsearch solutions for
Ss in Cells 1a and 3a were identified.² These data appear in Table 2.
Apparently, relatively few Ss in either cell attained a search solution
to solve the problem. However, the number of Ss in Cell 1a attaining a
search solution significantly exceeded that of Cell 3a (χ² = 7.64,
df = 1, p < .01).

² The use of the category “emergent search solutions” was not necessary
in this experiment since the use of a search criterion problem requiring
transfer reduced the number of such emergent solutions to zero (as far
as one could tell from the protocols).

Apparently, many Ss in Cell 1a attempted to discover a search solution
but failed, and reverted to the nonsearch approach, thus inflating the
time data for that cell. Furthermore, if the search criterion problem is
considered to be a nonsearch problem, since many Ss reacted to it as
such, then Cells 2a and 3a represent appropriate prior experience with
Cells 1a and 4a inappropriate prior experience. Of note, the former two
cells took less time to solution than the latter two. Supporting
evidence for these conjectures is obtained by comparing the times of
those attaining search solutions in Cells 1a and 3a and comparing the
times of those attaining nonsearch solutions. Since only 3 Ss in Cell 3a
attained search solutions, these data were not broken down. The 14 Ss
who attained search solutions in Cell 1a attained a solution
significantly more quickly than Ss in Cell 3a (79 seconds as compared to
97 seconds, t = 2.18, p < .05). Cell 1a Ss who attained a nonsearch
solution took significantly more time than Cell 3a Ss (118 seconds > 97
seconds, t = 2.59, p < .02). This latter comparison supports the
inference that some Cell 1a Ss were set to look for a shortcut and
looked for one; being unsuccessful they resorted to a nonsearch
solution. The total of searching and adding required more time than
simply adding (Cell 3a).
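
The subgroup times quoted above can be tied back to the overall Cell 1a mean with a weighted average. The sketch below is a minimal illustration; the function name is hypothetical, and the subgroup sizes and times are the ones given in the text and in Table 2.

```python
def weighted_mean(groups):
    """Combine (n, mean) pairs into the mean of the pooled group."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Cell 1a: 14 Ss with search solutions (79 s), 36 Ss with nonsearch solutions (118 s).
cell_1a_subgroups = [(14, 79), (36, 118)]
print(round(weighted_mean(cell_1a_subgroups), 2))
```

The pooled value is about 107 seconds, in line with the Cell 1a mean shown in Figure 2, which illustrates how the slower nonsearch subgroup dominates the cell mean.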

Furthermore, Cell 1a Ss who attained a 
nonsearch solution took about the same 
amount of time to solution as did Cell 4a 
Ss—the group which had search experience 
and a nonsearch criterion problem, further 
substantiating the inference that the for- 
mer searched but failed to find the short- 
cut. On this basis, the search set did not 
appear to transfer. 


Experiment III 


Rationale and Procedure 


Experiment III was undertaken on the assump- 
tion that some Ss in Cell 1a in Experiment II were 
capable of search under transfer conditions but 
had abandoned search to save time, after attempt- 
ing to discover a search solution on the search 
transfer criterion problem and not immediately 
finding one. In order to discover if, in fact, prior 
search experience led to a transferable search set, 
it was deemed necessary to increase the value and 
efficiency of the search solution. To accomplish 
this, Cells 1a and 3a of Experiment II were re- 
peated using a 12 X 12 matrix of numbers as a 
criterion problem (rather than a 4 X 6), with the 
same shortcut rule as in Experiment II applying. 
A nonsearch criterion problem of comparable size 
was also generated. Since the nonsearch solution 
time was so greatly increased in this larger prob- 
lem, the efficiency of the search solution was in- 
creased accordingly. 

Forty-eight different Ss from the same population
as in the previous two experiments participated
in Experiment III. Procedures in the third 
experiment were the same as in the previous one 
for Cells 1a and 3a (see Figure 2). The same prior 
experience problems were used in Experiment III 
as in the previous two experiments; only the criterion
problems differed. 


Results


The Ss in the two cells of Experiment III were found not to differ in
time to solution (339 seconds and 340 seconds). A comparison of
frequency of Ss attaining search solutions to those attaining nonsearch
solutions appears in Table 3. As the table shows, significantly more
Cell 1a’ Ss attained search solutions than did Cell 3a’ Ss (χ² = 6.80,
df = 1, p < .01). The Ss in Cell 1a’ attaining a search solution took
significantly less time to solution than Cell 3a’ Ss (t = 3.36, df = 32,
p < .01), while Cell 3a’ Ss took significantly less time to solution
than Ss in Cell 1a’ who attained a nonsearch solution (t = 1.90,
df = 35, p < .05).³


TABLE 3
Number of Subjects in Cells 1a’ and 3a’ Who Attained Search and
Nonsearch Solutions to the Search Criterion Problem and Mean Time to
Solution in Seconds for Each Subgroup (Experiment III)

               Cell 1a’              Cell 3a’
Solution     Subjects   Time      Subjects   Time          Total subjects
Search       11         215       2          [illegible]   13
Nonsearch    14         432       21         360           35
Total        25                   23                       48


³ Two Ss in Cell 3a’ attained a search solution to the criterion problem
(see Table 3). These two Ss completed the criterion problem in
[illegible] as much time as the mean for their cell. Had these two Ss
been removed from the Cell 3a’ data, the difference between the times of
nonsearchers in Cell 1a’ and the remaining Ss in Cell 3a’ (all
nonsearchers) would have been even more marked than the difference
shown.

This latter finding leads to the conclusion, as in Experiment II, that
being set to search (i.e., looking for a shortcut) transfers while the
use of the search set (i.e., attainment of a search solution) only
partially does so. In other words, the strategy transfers to a greater
extent than the skill. This inference gains greater support from the
data than does an explanation in terms of the relative efficiencies of
search and nonsearch solutions.


Discussion


From the findings of the three experiments, it was possible to make an
inference that was both important and unexpected, namely that the
strategy of search could be made more readily to transfer than the skill
of search, as the result of limited prior experience.

In the initial formulation, search strat- 
egy and search skill were incorporated 
into the single concept of search set with 
the expectation that such sets could be 
induced by providing appropriate ex- 
periences. This expectation appeared to 
hold in the first experiment where the 
search criterion problem and search ex- 
perience problems were quite similar (and 
furthermore, the shortcut in the search 
criterion problem was apparently “easy” 
to find). Analysis of protocols showed 
that the situation was not one where Ss 
either looked for or did not look for a 
shortcut. Among Ss receiving nonsearch 
experience, a shortcut solution to the 
search criterion problem was apparently 
“stumbled upon” (i.e., emerged) in the 
majority of cases. This, coupled with the 
significant main effect of criterion problem,
indicates that the ease of finding the 
shortcut (a problem characteristic) was an 
important determinant of the solution 
strategy adopted by S, even when contrary 
to the practice sequences provided by the 
experimenters. (This was an unintended 
outcome.) However, the greatest influence 
on strategy adoption was the joint effect 
of search experience and the visibility of 
a shortcut in the search criterion problem.

The visibility of the shortcut was 
drastically reduced in the criterion prob- 
lem used in the second experiment. This is 
indicated in part by the absence of a main 
effect of criterion problem in the second 
experiment and the fact that only 28% of 
Ss exposed to search experience and 6% 
of Ss not so exposed recognized the shortcut
solution (as opposed to 89% and 
69%, respectively, in the first experiment). 
Thus, the experience effect and problem 
effect are separated in the second experiment
by virtue of the elimination of the 
latter. 
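
These percentages follow from the solution counts reported for the two experiments. The sketch below recomputes them as a check; the helper name is hypothetical, and "recognized the shortcut" is taken to mean search plus emergent solutions in the first experiment, which is an assumption about how the percentages were formed.

```python
def percent(part, total):
    """Whole-number percentage of part in total."""
    return round(100 * part / total)


# Experiment I (Table 1): search plus emergent solutions out of 35 Ss per cell.
exp1_search_experience = percent(24 + 7, 35)     # Cell 1
exp1_nonsearch_experience = percent(9 + 15, 35)  # Cell 3

# Experiment II: search solutions out of 50 Ss (Cell 1a) and 52 Ss (Cell 3a).
exp2_search_experience = percent(14, 50)
exp2_nonsearch_experience = percent(3, 52)

print(exp1_search_experience, exp1_nonsearch_experience,
      exp2_search_experience, exp2_nonsearch_experience)
```

Under that assumption the counts reproduce the four percentages quoted in the text.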

The experience effect in the second experiment
is clearly not the simple one 
predicted in the second hypothesis. The 
search experience sequence appeared to 
have both enhancing and debilitating ef- 


fects on performance by Ss on the search 
transfer criterion problem. The nonsearch 
experience sequence on the other hand had 
no differential effect on search criterion 
problem versus nonsearch criterion prob- 
lem performance (compare Cells 2a and 
3a). The result was a significant inter- 
action between prior experience and cri- 
terion problem and prompted internal 
analyses. By splitting the group that re- 
ceived search experiences prior to the 
search criterion problem into those who 
attained a search solution and those who 
did not, it was possible to pinpoint the 
differential effect of search experience on 
strategy and skill. Specifically, Ss at- 
taining a nonsearch solution on the search 
criterion after search experience as com- 
pared to after nonsearch experience re- 
quired more time to solution. Apparently 
such Ss spent time looking for a shortcut 
as a result of their search experiences but 
had not had sufficient experience to de- 
velop the level of skill required to find a 
shortcut. Eventually they “gave up” and 
reverted to a nonsearch solution. Findings 
of the third experiment supported this in- 
terpretation. 

The consistent findings of this study lead to the conclusion that
search skill has quite limited transfer possibilities, certainly as
compared to search strategy (i.e., searching as compared to finding),
and stimulate the recommendation that the conceptual distinction between
these two phenomena be retained, and that they be labeled differently.
(The initial position in this paper was to combine them, which does not
now appear to be conceptually sound.)⁴ It would seem reasonable to
retain the term “search set” to refer to the strategy of search and use
some other term, perhaps “learning set,” to refer to the skill in
finding a shortcut solution. A specific experience or set of experiences
does not appear to affect both the strategy of search and the skill
involved in a particular kind of searching to the same degree. Thus,
these two phenomena are different in a practical as well as a conceptual
sense.

⁴ It must be emphasized that the main finding of this experiment was
serendipitous insofar as the experimenters let the data lead them to the
unavoidable conclusions. When breaking new ground, this appears to be a
useful research strategy.

One implication of this finding is that 
limited educational exposure to elegant 
thinking and problem-solving approaches 
may induce students to adopt the strategy 
to search when confronted by transfer 
situations but leave them lacking the skill 
to successfully apply this strategy. The 
result would be performance inferior to 
the inelegant technique, and perhaps 
frustration. Based on this implication, one 
must take care to provide a level of skill 
commensurate with a student’s commit- 
ment to a strategy in order that he can 
use this strategy effectively, if at all. 
More extensive experience sequences than 
those provided in this experiment would 
be needed. 

It was concluded from the first experiment
that suggestion has relatively little 
effect on performance. Certainly suggestion 
provides S with no environmental evidence
for the efficacy of the strategy. 


DELAYED INFORMATION FEEDBACK, FEEDBACK CUES,
RETENTION SET, AND DELAYED RETENTION


8 groups of about 20 undergraduates each were presented with 60
factual multiple-choice items, answered each question, and either
received feedback immediately or 24 hr. later. For 4 groups feedback
included the stem and 4 alternatives to each question. 4 groups received
only the 4 alternatives. Just prior to feedback 4 groups were given a
retention set and 4 groups were not. The design of the experiment was
2 (feedback) × 2 (stem, no stem) × 2 (set, no set). A 60-item immediate
and 5-day delayed retention test was administered. On immediate
retention only the effect of stem, no stem was significant. On the
delayed retention test the groups receiving delayed feedback, the stem
of the question, and the retention set performed reliably higher than
their counterparts.

The idea that learning is improved when 
a reinforcer or some information feedback 
(IF) promptly follows one’s behavior is 
found in several prominent theories of 
learning (Hull, 1952; Skinner, 1957; 
Spence, 1956). Most textbooks in psychol- 
ogy and educational psychology mention 
the superiority of immediate information 
feedback (IIF) as a principle of human 
learning and point out its presumed im- 
portance in educational practice. The ad- 
vent of teaching machines has increased 
the acceptance of this principle among 
educators and most psychologists. How- 
ever, the application of IF is not limited in 
education to teaching machines. Thus, tell- 
ing a student his answer is “right,” or 
quickly handing back examinations with 
corrections and comments, are examples 
of IF. Sassenrath and Garverick (1965)
have recently found that different types of 
IF on midterm examinations, 2 days after 
taking the examination, have different ef- 
fects on retention and transfer of learning. 
Evidence for the effects of reinforce- 
ment or IF on learning comes from three 
general kinds of studies: (a) animal experiments;
(b) experiments of human motor
skills; and (c) experiments employing
verbal materials. In the area of animal
learning, Renner (1964) reviewed the liter- 
ature on delay of reinforcement. The evi- 
dence indicates that learning efficiency 
decreases the longer the delay of reinforcement.
On this point there is certainly a big
question regarding the logic of inferring
from a rat learning to press a bar or a 
pigeon learning to peck at a disk to a 
child learning to speak his native language 
or an adult learning matrix algebra. The 
implication is that the principle of rein- 
forcement may not have the generality it is 
commonly assumed to have. Furthermore, 
the learning of factual multiple-choice 
materials, which will be reported in the 
present study, may not have much to do 
with learning a language or mathematics 
either. 

Evidence from studies on human motor 
skills calls into question the superiority of 
IIF on learning. Of 14 studies dealing
with this problem, 11 show no significant
difference in learning efficiency, one favors
IIF, and two favor delayed information
feedback (DIF) (Brackbill, Wagner, &
Wilson, 1964). In experiments dealing 
with verbal, concept, or discrimination 
learning materials contradictory results 
have been obtained. Saltzman (1951) and 
Bourne (1957) report learning impairment 
associated with increased DIF. On the
other hand, Brackbill and associates 
(Brackbill, Boblitt, Davlin, & Wagner, 
1963; Brackbill, Bravos, & Starr, 1962; 
Brackbill, Isaacs, & Smelkinson, 1962; 
Brackbill & Kappy, 1962; Brackbill, Wag- 
ner, & Wilson, 1964) have generally found 
no difference in rate of learning between 
IIF and DIF. These latter findings are important
since most of the previous studies
with humans and animals, and most of the
theories of learning, have taken the position 
that DIF or delayed reinforcement either 
had no effect or impaired rate of learning. 

Equally important for psychological 
theory are the findings on children by 
Brackbill and on adults by Sturgis (1966) 
and Sturgis and Crawford (1963) that 
DIF produces higher retention scores than 
does IIF under some conditions. For educa- 
tion these results also are important. Why 
teach or learn something if it is not re- 
membered after some brief period of time? 
One goal of education is not just to have 
students learn but to have them remem- 
ber what they have learned so that they 
may use or transfer what they remember 
to other learning situations or so they can 
use their knowledges and skills to solve 
problems. 

One of the possible theoretical explana- 
tions for the beneficial effect of DIF on re- 
tention is the hypothesis that during the 
delay interval the learner can use response- 
produced, external, or verbal cues to help 
mediate the DIF period (Brackbill & 
Kappy, 1962). With verbal materials and
highly verbal older children or college stu- 
dents, it would appear that the use of ver- 
bal cues might be extremely important in 
regulating one’s behavior with respect to 
memory of past events and the anticipa- 
tion of future events. This may be one rea- 
son why delayed reinforcement with ani- 
mals impairs learning (Renner, 1964), 
whereas with children and adults DIF did 
not impair learning and did facilitate re- 
tention. 

In addition, in animal studies of de- 
layed reinforcement, the animal has to 
remember his response over the delay inter- 
val and is then given only the reinforce- 
ment without being again presented with 
the original task and alternatives. In the 
studies with humans, the subject (S) also 
has to remember his response over the de- 
lay period but during IF he is usually pre- 
sented with the question or task as well as 
the alternatives he should or should not 
have made. Thus, the partial or complete 
re-presentation of the initial task (or question
and alternative) appears to be an important
issue, and one which will be inquired
into in the present study.

Related to the notion that external or
verbal cues between response and IF facilitate
retention is the fact that a set to remember
(instructional cues) when given to
students before learning (feedback) enhances
retention (Karen, 1956). On the
other hand, a set to remember given students
after learning (feedback) does not
enhance retention (Ausubel, Schpoont, &
Cukier, 1957). Perhaps this is because the
way the material is learned and remembered
under a retention set is different than
under no retention set, even though the
same amount of material may be learned.

In her research with college students,
Sturgis (1966) found that DIF produced
superior delayed retention without influencing
immediate retention, but only when
students were given complete feedback
cues. However, in her study and in a previous
one (Sturgis & Crawford, 1963)
Sturgis confounded IIF per question with
24-hour DIF per list. Therefore, the purpose
of the present experiment was to correct
this methodological difficulty and to
ascertain the effect on immediate and delayed
retention of (a) DIF, (b) the
amount of IF cues, and (c) a set to remember.


METHOD


Materials and Procedure 


Sixty four-alternative, factual, multiple-choice
questions generally found in courses in introductory
psychology constituted the learning and
retention materials. Thirty-eight of these questions
were used by Sturgis (1966). Each question
was mimeographed on an 8.5 X 3-inch piece of
paper and the 60 questions were stapled together
into a packet. Each student was given a packet
and an IBM answer sheet for the initial presentation
of the list of questions. Students were
given 15 seconds to read and mark an answer to
each of the 60 questions. After the initial presentation
of the list, the packet and the answer sheet
were collected. The group which received IIF was
immediately given another packet of 60 items.
Half of the students received a packet with the
stem cues of the question and the four alternatives
with the correct alternative underlined. The
other half of the IIF group received a packet of
the 60 items with the four alternatives and the correct
alternatives underlined, but there was no stem
cue presented. In both cases students were given
40 seconds to read each item during the IIF
period. About half of the students receiving stem
cues and no stem cues were told just before the
IIF that they should try hard to learn and remember
the correct answer to each question since
there would be a retention test later. This is 
called a retention set. The other half of the 
students received no retention set. 

Half of the students who took the initial pres- 
entation of the 60 questions received 24-hour 
DIF. As was done with the IIF students, half of 
the DIF students received the stem cues of each 
question and the four alternatives with the correct 
one underlined. The other half did not see the 
stem cues. Both groups received a 10-second IF 
exposure on each item. Again, as was done with 
the IIF students, half the students under the DIF
condition that received the stem and no stem were 
given the retention set immediately before the 
DIF. The other half had no retention set. 

Immediately after completing either the IIF 
or the DIF, the packet of materials was collected 
and a mimeographed retention test of the 60 
multiple-choice questions and an IBM answer
sheet were passed out to each student. The questions
on the immediate retention test were in
a different order than on the initial presentation. 
Students could answer the questions on the re- 
tention test at their own rate, and most students 
finished in about 10 minutes. This is called the 
immediate retention test. Five days later, students 
were administered the same retention test and 
worked through it at their own rate. The questions 
were arranged in a different order again. Most 
students finished this delayed retention test in 
about 12 minutes. 


Subjects and Design 


The students were 163 upperclassmen enrolled
in four sections in a course in introduction to educational
psychology. The experiment was a 2
(IIF versus DIF) X 2 (stem cues versus no stem
cues) X 2 (retention set versus no retention set)
design with about 20 students in each of the eight 
groups. The experimental treatments were as- 
signed to the sections at random. Conflicts in class 
schedules precluded assigning students to sections 
at random. 
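
The factorial layout described above can be sketched in a few lines. This is only an illustration of how the eight treatment groups arise from the three two-level factors; the constant and function names are hypothetical and do not come from the study.

```python
from itertools import product

# Factor levels as named in the text; the variable names are illustrative only.
FEEDBACK = ("IIF", "DIF")
CUES = ("stem cues", "no stem cues")
RETENTION_SET = ("retention set", "no retention set")


def design_cells():
    """Return the eight treatment combinations of the 2 X 2 X 2 design."""
    return list(product(FEEDBACK, CUES, RETENTION_SET))


for number, cell in enumerate(design_cells(), start=1):
    print(number, *cell, sep=" | ")
```

Each printed row corresponds to one of the eight groups of about 20 students.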


RESULTS 


Table 1 presents the mean scores on 
the initial, immediate retention, and de- 
layed retention tests for students in each 
of the three experimental conditions. A 
2 X 2 X 2 analysis of variance (with a correction
for unequal Ss in each group) of
the initial test scores indicates that there 
was no significant difference among the 
groups due to IF, feedback cues, or reten- 
tion set. In addition, none of the interac- 
tions was significant. Thus, the students in 
the subsequent experimental conditions 
were on the average about equal in terms of 
their scores on the initial test. 

A three-way analysis of variance (with
correction for unequal N) of the immediate
retention test showed that there were
no significant differences due to IF or retention
set, but the difference due to feedback
cues was significant, F = 38.53, df =
1/156, p < .001. Apparently, whether IF is
immediate or delayed 24 hours after the 
initial presentation has no immediate effect 
upon retention. Also being told to remember 
what one is about to learn has no effect 
on immediate retention. However, when IF 
included the stem of the question there was 
a greater effect upon immediate retention
than when IF did not include the stem of 
the question. Notice, however, that Ss who 
received no stem cues still learned a great 
deal as measured from their initial per- 
formance. For that matter, notice that the 
increase in scores from the initial to the 
immediate retention test due to learning is 
about 100%. None of the interactions on 
the immediate retention test was signifi- 
cant. 

On the delayed retention test, the anal- 
ysis of variance (with correction for un- 
equal N) indicated that the differences 
were significant for IF, F = 5.06, df = 
1/156, p < .05, for retention set, F = 7.69,
df = 1/156, p < .01, and for feedback 
cues, F = 45.94, df = 1/156, p < .001. 
None of the interactions was significant. 
Thus, as can be seen in Table 1, Ss who 
received DIF performed reliably higher on 
delayed retention than Ss who received 
IIF. Also Ss who received stem cues during
feedback did better than Ss who received
no stem cues. And Ss who received
a retention set performed higher on de- 
layed retention than Ss who did not re- 
ceive a retention set. 


DISCUSSION


The results of the present study indi- 
cate that DIF and IIF during learning 
produce no differential effect on a reten- 
tion test administered immediately after 
learning. On the other hand, 5 days later 
on the delayed retention test, there is a 
small but significant difference favoring 
Ss who received DIF. Both Sturgis (1966) 
and Sturgis and Crawford (1963), also 
working with college students and employing
similar materials, have found
similar results. This was true in spite of 
the fact that Sturgis confounded IIF per 
item with DIF per list of items. The only 
difference was that Sturgis (1966) found 
an interaction in that DIF was superior 
only when Ss received the right and wrong 
alternatives along with the stem of the 
question during IF. The Ss in the study 
by Sturgis that received DIF with the 
question and only the correct alternative 
did not perform reliably higher than Ss 
that received IIF. In their studies with
third graders, Brackbill and associates
(Brackbill, Boblitt, Davlin, & Wagner,
1963; Brackbill, Bravos, & Starr, 1962;
Brackbill, Isaacs, & Smelkinson, 1962;
Brackbill & Kappy, 1962; Brackbill, Wagner,
& Wilson, 1964) also have generally
found that Ss receiving DIF, even of only
10 seconds, performed higher on relearning
than Ss that received IIF, and also generally
found that DIF did not impair initial
learning. Thus, there is mounting evidence
that DIF does not retard learning
and may enhance delayed retention. If so,
these results have considerable implications
for learning theory, programmed instruction,
and classroom teaching.

How does one explain the effect of DIF
on retention? One hypothesis is that with
DIF Ss can make use of more cues from
the initial task or question before DIF is
presented (Brackbill & Kappy, 1962;
Renner, 1964). For human Ss with a verbal
repertoire, these cues are largely verbal
and assist Ss in mediating the DIF period
and/or the period between learning and retention.
However, if this hypothesis were
tenable, why is it that in the present experiment
and in the studies by Sturgis
the effect of DIF was not found on the
immediate retention test but only appeared
on the delayed retention test? In other
words, whatever brings about the effect of
DIF on delayed retention does not occur
solely before, during, or immediately
after feedback (learning). Otherwise, the
effect should occur on the immediate retention
test. It would appear that there
might be some interaction between what
happens during the DIF process and the interval
prior to or during the administration
of the delayed retention test. It could be
speculated that during the DIF period
when a verbal S responds initially to a
question, certain response-produced verbal
cues may be implicitly or covertly rehearsed.
This initial rehearsal may not
have any beneficial effect on immediate retention,
but may bring about a second
covert or overt verbal rehearsal of the
question during the delayed retention interval
or at the time of the delayed retention
test. Thus, the verbal cues [...] of the
feedback process may influence the
verbal cues during the retention process.
An experiment is now underway to deter- 
mine if various kinds of cues presented 
during DIF or IIF have any effect on im- 
mediate and delayed retention. 

Ausubel, Schpoont, and Cukier (1957) 
argue that set to remember does not pro- 
duce superior retention per se but rather 
results in superior learning which, in turn, 
produces greater retention. The results of 
the present study, however, indicate that a 
set to remember produces superior results 
on the delayed retention test as compared 
with a no-set condition, but no difference 
on the immediate retention test. That is, 
there is no evidence for differential learn- 
ing even though differential retention was 
evident. This would suggest that the mean- 
ing of the task, rather than degree of 
learning, was modified by set instructions. 

In the present study it was also found 
that Ss receiving cues during IF, which in- 
cluded the stem of the question and the 
alternatives, performed reliably higher on 
both the immediate and delayed retention 
tests than did Ss receiving no stem cues but 
only the alternatives. In other words, S had
to remember his response over the period 
prior to DIF but was then given either 
full or partial IF which included the 
question or task as well as the alternatives 
they should and should not have made. In 
animal studies of delay reinforcement S 
also has to remember his response over the 
delay interval but is then only given 
reinforcement without also being presented 
with the initial task and alternatives. Under
these conditions animals receiving a
long (over 10 seconds) delay in reinforce- 
ment show little if any learning or reten- 
tion. On the other hand, human Ss under 
24-hour DIF condition who received only 
the alternatives without the task or ques- 
tion still show considerable learning and 
retention. However, they do not show as 
much retention as Ss presented with the 
task and alternatives at DIF. 

BEHAVIORAL CORRELATES OF ACADEMIC ACHIEVEMENT
II. PURSUIT OF INDIVIDUAL VERSUS GROUP GOALS IN A
DECISION-MAKING TASK


ROBERT S. WYER, JR.
University of Illinois, Chicago Circle 


College freshmen, representatives of 4 combinations of academic 
aptitude (college entrance examination score) and academic perform- 
ance (1st-term grade-point average), participated in 2-person groups 
in a decision-making task in which their choices either would increase 
the likelihood of individual goal attainment at the expense of group 
goal attainment, or would increase the likelihood of attaining group 
goals at the sacrifice of individual goals. As expected, there was a 
general tendency to increase the frequency of group-oriented choices 
when group goals were greatly affected. When choices had relatively 
little effect upon the likelihood of attaining group goals, the frequency 
of group-oriented choices (choices that would increase the likelihood 
of group goal attainment) was related positively to academic perform- 
ance among both males and females. When decisions had relatively 
great effect upon group goals, this relationship occurred only among 
Ss of high academic aptitude. Results raised questions concerning the 
general validity of the assumption that socially oriented behavior 
tendencies are detrimental to academic effectiveness. 


Research on the motivational correlates 
of academic achievement (cf. Pierce & 
Bowman, 1965; Todd, Terrell, & Frank, 
1962) has emphasized two factors: The 
desire for personal achievement and the 
desire for effective social relations. Wyer 
and Terrell (1965) found significant sex 
differences in the relationships between 
academic performance and the acknowledged
desire for recognition in both academic
and social areas. In general,
however, studies of motivational factors as- 
sociated with academic effectiveness have 
yielded fairly inconclusive results (Lavin, 
1965, pp. 74-79). 

The effect of social motivation upon 
academic achievement has been especially 
unclear. It is often assumed that a social 
orientation is detrimental to the pursuit 
of academic goals. This presupposes an in- 
compatibility between socially directed be- 
havior and academic goal attainment. 
However, the willingness to cooperate with 
other persons in achievement-related activity
may often lead to more effective
goal seeking; if manifested in academic
areas (e.g., through participation in informal
group discussions, giving and receiving
assistance in problem solving, etc.),
it may increase academic effectiveness. The
study reported here, one of a series of
studies of nonintellective factors associated
with academic success in college (Wyer,
1965, 1967; Wyer & Terrell, 1965;
Wyer, Weatherley, & Terrell, 1965), was
concerned with this issue. A situation was
constructed in which subjects (Ss) were
faced with a decision either to seek goals
independently or to seek them in cooperation
with other persons. The tendency to
respond cooperatively in this situation
was analyzed as a function of academic
aptitude and performance. These analyses
were expected to provide information on
the facilitative or detrimental effects of
socially oriented behavioral tendencies
upon academic achievement.

Representatives of four combinations of
aptitude (college entrance examination
score) and performance (grade-point average)
were divided into two-person groups
and asked to participate in a decision-making
task. They were told that prizes
would be given both for individual per- 
formance and for team performance. The 
task was similar to that described by 
Deutsch (1962). That is, S was given two 
choices (X1 and X2) and was told that his
partner would also have two choices (Y1
and Y2). The outcome in terms of the
number of points received was ostensibly 
determined by the conjunction of the two 
players’ choices. As an example, consider 
the following matrix: 


                      Player Y
                    Y1        Y2
Player X     X1     [illegible]
             X2     8,2       3,5


For each combination of choices the first 
number indicates the outcome for X and 
the second number the outcome for Y. For 
instance, if X2 and Y2 were selected, X 
would receive 2 points and Y would re- 
ceive 8 points; the number of points that 
players would receive as a team is the 
sum of these, or 10 points. 

Eight different matrices similar to the 
one above were constructed and presented 
to Ss sequentially. The Ss were told that 
both the total number of points each 
player had won as an individual and the 
total number of points each team had won
would be determined after the experiment,
and that monetary rewards would be given
both to persons who had accumulated the 
most points as individual players and to 
groups who had accumulated the most 
points as a team. The possible outcomes of 
various combinations of responses were 
made known to each S; however, partici- 
pants were given no knowledge of their 
partner’s choices while playing the game. 

Several factors may affect decisions in 
situations such as the one described: One’s 
expectancy for what his partner will do, 
the effect of one’s choice upon both his 
own and his partner’s outcomes, the rela- 
tive value attached to team and individual 
goals, etc. The dynamics underlying de- 
cision-making processes in these situations 
are complex and have been studied extensively
by Deutsch and his colleagues
(cf. Deutsch, 1962). The relative contribution
of these factors can be manipulated
fairly successfully by varying the relative 
magnitude of outcomes in the decision 
matrix. In the present study, a set of 
matrices was constructed that would al- 
low inferences to be made concerning each 
S's relative preference for seeking individ- 
ual goals (based upon the number of 
points he accumulated as an individual 
player) and team goals (based upon the 
combined number of points accumulated 
by himself and his partner). For example,
in the matrix presented above, the selection
of X2 would be assumed to indicate
a team orientation, while the selection of
X1 would be assumed to indicate a decision
to seek individual goals. 

A preference for individual rather than 
team goals may have several determinants. 
For example, it may indicate a desire to 
receive recognition as an individual rather 
than as a member of a winning group; or, 
it could be one manifestation of a general 
tendency to be socially aloof and auton- 
omous. Alternatively, a high team orienta- 
tion could indicate a desire to please or 
ingratiate one’s partner. More simply, it 
may indicate that S has been reinforced 
more frequently in the past for cooperative 
activity than for independent activity.
Regardless of these factors, the relation- 
ship between academic performance and 
the tendency to seek goals cooperatively 
rather than individually was expected to 
have implications for the effect of socially 
directed behavior upon academic effectiveness.
For example, if academic achievement
is facilitated by a general tendency
to persevere in achievement-related activity
independently of other persons, a
negative relationship between academic 
performance and the frequency of team- 
oriented responses would be expected. On 
the other hand, if a general tendency to 
cooperate with other persons in achieve- 
ment-related activity increases academic 
effectiveness, the relationship between 
team orientation and academic perform- 
ance may be positive. 

METHOD


Selection of Subjects 


Participants in the study were selected from 
freshman liberal arts and science students at an 
urban midwestern university for whom measures 
of aptitude and performance were available. Per- 
formance (Per) was measured by first-quarter 
grade-point average. Aptitude (Apt) was meas- 
ured by composite score on the American College 
Testing Service college entrance examination. Defined
in this way, aptitude was not considered
to be an index of intelligence, but rather was as- 
sumed to reflect the degree to which a student has 
mastered general intellectual skills prior to enter- 
ing college, and therefore a predictor of the level 
of performance he should obtain with an average 
amount of effort. 

Measures of Apt and Per were each converted 
to z-scores and Ss placed into four categories according
to the following criteria:

1. High Apt, high Per: z-scores of greater than
.50 for both Apt and Per, and an absolute difference
between Apt and Per z-scores of less than .30.

2. Low Apt, low Per: z-scores of less than -.50
for both Apt and Per, and an absolute difference
between Apt and Per z-scores of less than .30.

3. High Apt, low Per: an Apt z-score of greater
than .50, a Per z-score of less than -.50, and a
difference between Apt and Per z-scores of greater
than 1.30.

4. Low Apt, high Per: an Apt z-score of less
than -.50, a Per z-score of greater than .50, and a
difference between Per and Apt z-scores of greater
than 1.30.
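
The four selection criteria can be written as a small classification rule. The sketch below is illustrative only: it assumes the cutoffs are z-scores of .50 and -.50 and difference limits of .30 and 1.30, and the function name `classify` is hypothetical, not part of the original procedure.

```python
def classify(apt_z, per_z):
    """Return the category label for one student, or None if no category applies.

    apt_z and per_z are z-scores for aptitude and performance. The cutoffs
    (.50, .30, 1.30) are assumed readings of the criteria listed above.
    """
    absolute_difference = abs(apt_z - per_z)
    if apt_z > 0.50 and per_z > 0.50 and absolute_difference < 0.30:
        return "high Apt, high Per"
    if apt_z < -0.50 and per_z < -0.50 and absolute_difference < 0.30:
        return "low Apt, low Per"
    if apt_z > 0.50 and per_z < -0.50 and apt_z - per_z > 1.30:
        return "high Apt, low Per"
    if apt_z < -0.50 and per_z > 0.50 and per_z - apt_z > 1.30:
        return "low Apt, high Per"
    return None
```

Under these assumptions a student with z-scores of 1.0 (Apt) and -0.6 (Per) falls in the high Apt, low Per category, whereas one with 0.6 and -0.6 meets neither cutoff pattern fully (the difference is only 1.20) and would not be selected.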

Sixteen males and 16 females were selected at 
each of the four combinations of Apt and Per and 
recruited for the experiment. They were each paid 
$1 for their services. 


Construction of Matrices 


Two criteria were used in preparing decision- 
making matrices. First, S’s choice should une- 
quivocally reflect a decision to pursue either a 
team goal or an individual goal; to meet this cri- 
terion, any alternative that maximized the number 
of points that the group would receive minimized 
the number of points S would receive as an individual
player, and vice versa. Second, S’s decision
should depend minimally upon the response he 
expects his partner to make. This was done either 
by making the partner’s choice clear, or by insur- 
ing that an S's choice could be interpreted simi- 
larly regardless of the partner’s choice. The eight 
matrices selected, in the order of their presentation 
to Ss, are shown in Table 1. (In each case, assume 
that S is Player X.) 

The effect of S’s choice upon team outcomes 
relative to the effect of his choice upon individual 
outcomes varied over matrices. To indicate the 
extent of this variation, the effect of S’s choice
upon the number of points he received as an indi- 
vidual player (I) was determined by subtracting 
the number of points he would receive by making
one selection from the number he would receive
by making the other choice. The choice that Player
Y would be expected to make in responding to
Matrices 2 and 4 was clear; for these, the number
of points X would receive given this choice (Y1)
was used in the calculations. For the other
matrices, in which Y might be expected to make
either response, the number of points X
would receive as an individual player was averaged
over Y’s alternatives. The effect of S’s choice upon
team outcomes (T) was determined similarly.
The effect of S’s choice on team outcomes
relative to individual outcomes (D_TI) was then
calculated for each matrix by subtracting I from
T. Values of I, T, and D_TI for each matrix are
shown in Table 1.
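
The calculation just described can be illustrated with a short sketch. It is an interpretation rather than the author's own procedure: it assumes that each effect is taken as an absolute difference between the two rows of a matrix, and the function and variable names are hypothetical. The two matrices are entered as they read in Table 1 (Matrices 2 and 3).

```python
def choice_effects(matrix, partner_choice=None):
    """Compute (I, T, D) for one 2 x 2 decision matrix.

    matrix[x][y] is a pair (points for X, points for Y), with x and y in {0, 1}.
    partner_choice is the index of Y's expected choice, or None to average
    over both of Y's alternatives.
    """
    columns = [partner_choice] if partner_choice is not None else [0, 1]

    def mean_individual(x):
        # Mean number of points X receives when choosing row x.
        return sum(matrix[x][y][0] for y in columns) / len(columns)

    def mean_team(x):
        # Mean combined points for X and Y when X chooses row x.
        return sum(sum(matrix[x][y]) for y in columns) / len(columns)

    individual_effect = abs(mean_individual(0) - mean_individual(1))
    team_effect = abs(mean_team(0) - mean_team(1))
    return individual_effect, team_effect, team_effect - individual_effect


matrix_2 = [[(5, 4), (4, 2)], [(2, 9), (5, 3)]]
matrix_3 = [[(5, 5), (6, 6)], [(6, 3), (8, 3)]]

print(choice_effects(matrix_2, partner_choice=0))  # (3.0, 2.0, -1.0)
print(choice_effects(matrix_3))                    # (1.5, 1.0, -0.5)
```

Under these assumptions the sketch reproduces the tabled values for both matrices: I = 3, T = 2, and a difference of -1 for Matrix 2 (partner assumed to choose Y1), and I = 1.5, T = 1, and a difference of -.5 for Matrix 3.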

It could be argued that Ss who adopt a team 
strategy in responding to these matrices do so not 
because they have interest in cooperative pursuit 
of goals per se, but because they believe this 
strategy to maximize the probability of receiving 
a reward, independent of its group or individual
nature. This, however, seems unlikely. Which
strategy is really optimal is difficult to discern.
For example, on Matrix 1 it appears that if X 


TABLE 1

Summary of Matrices Presented to Subjects
and the Effect of Choices on Outcomes

         Subject's   Partner's choices     Average effect of choice
Matrix   choice      Y1        Y2          On team (T)   On individual (I)   D_TI
1        X1          [illegible]           [illegible]
         X2          8,2       3,5
2        X1          5,4       4,2         2             3*                  -1
         X2          2,9       5,3
3        X1          5,5       6,6         1             1.5                 -.5
         X2          6,3       8,3
4        X1          5,5       6,4         2             2*                  0
         X2          3,9       4,8
5        X1          5,5       0,7         4             1.5                 2.5
         X2          7,0       1,1
6        X1          7,4       6,3         2             1                   [illegible]
         X2          6,6       4,7
7        X1          9,2       [illegible]
         X2          8,8       6,7
8        X1          [illegible]
         X2          4,9       3,8

* Calculation based upon assumption that partner would select Y1.

played individually he would have an excellent
chance of attaining a group goal as well as an individual
goal should Y use a group strategy, and
would not lose much (relative to Y) should Y
play individually. On subsequent matrices, however,
a group strategy often appears optimal.
Furthermore, it should be noted that if X plays
for a team goal he increases Y’s chances of receiving
both a group goal and an individual goal; a
decision to do so therefore seems highly coopera- 
tive. 

If the total points attainable by X and Y under
the four possible combinations of strategies used 
by the two players are calculated, it can be seen 
that the optimal strategy for Y would be to play 
individually, because he will maximize individual 
outcomes and, provided X adopts a group strategy, 
would also have high team outcomes. X, if he 
were to realize this, would be confronted with the 
decision to take a chance on winning a group goal
by obtaining high (87-92) but not optimal (100) 
payoff, or to try for a high (38-43) but not optimal 
(55) individual score. If he decides the former, 
he decreases his chances for an individual reward 
(since Y will be above him) but remains high 
in the running for a group goal regardless of Y’s 
behavior. On the other hand, if he decides to seek 
an individual goal, he insures that he is higher 
than at least one other competitor (Y). The diffi- 
culty of determining which orientation will maxi- 
mize the likelihood of receiving a reward seems to 
justify the assumption that differences in orienta- 
tion primarily reflect differences in the relative 
preference for pursuing goals cooperatively versus 
individually. (In this regard, Ss typically took less 
than 10 minutes to complete the task; since the 
time required to determine the optimal strategy 
to use across matrices would be substantial, this 
fact also argues against the alternative interpreta- 
tion in question.) 


Administration Procedure 


Four or five pairs of Ss were administered the 
task simultaneously. In each case, partners were 
of the same sex. In four instances in which an S 
did not show up for the experiment at the scheduled
time, the “odd” S was informed that a person
had been left over during a previous session and 
was told to assume that this person was his part- 
ner, 

Partners were placed beside one another at long 
tables, far enough apart so that they could not see 
each other’s work. They were given booklets con- 
taining one sample matrix and the eight test 
matrices described above. One member of each
team was designated Player A and the other
Player B. The Ss were led to believe that the 
matrices they were presented indicated both their 
own and their partner’s outcomes. In fact, the 
forms distributed to all Ss were identical except 
for the sample matrix and their designation as 
either Player A or Player B. (In Table 1, Player 
X was always the S, regardless of whether he was 
formally assigned to be A or B.) 


To explain the task, Ss designated as Player A 
were presented the sample matrix below: 


                    Player B 
                   B1      B2 
Player A    A1    5,6     3,6 
            A2    5,3     4,3 


The Ss designated as B were presented a similar 
matrix, rearranged so that B’s outcomes were listed 
first. All Ss were read the following instructions: 


We are interested in determining how persons 
behave when their behavior affects not only 
their own goals but the goals attained by others. 
I am going to ask you to play a game with the 
person next to you. One player, labeled A, will 
be able to choose either A1 or A2; the second 
player, B, will have to choose between alterna- 
tives marked B1 and B2. On any given trial, 
each player will be awarded a certain number 
of points. The number of points he wins will 
depend not only upon his own choice but also 
on the choice of his partner. 


The Ss were then referred to the sample matrix 
on the first page of their booklet and the outcomes 
of each combination of choices were explained. 
The instructions then continued: 


On the form I have passed out there are eight 
tables similar to this one. Below each table, the 
possible combinations of choices and outcomes 
are written down. Both you and your partner 
will have three trials in each game. At no time, 
however, will you know how your partner has 
moved before making your own decision. In 
planning your move, you will therefore have to 
guess how he is likely to respond. The number 
of points each player wins will be determined 
after the game by comparing the choices each 
player has made on corresponding trials. 

To provide an incentive to perform well on the 
task, and also to make clear to Ss that they could 
work either for individual goals or for team goals, 
the following additional instructions were read: 


Each player will receive a score based upon 
the total number of points he has accumulated. 
On the other hand, each team will also receive 
a score based upon the total number of points 
both partners together have accumulated. To 
make the game interesting, we will award a 
prize of $2.50 to each of the ten individuals in 
the entire experiment who accumulate the great- 
est number of points for themselves, and a prize 
of $2.50 to each player on the five teams who 
have accumulated the greatest number of points 
as a team. There are 128 persons competing on 
64 teams; your chance of winning either a team 
prize or an individual prize is therefore about 
one out of six. 
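
The "about one out of six" figure quoted to the Ss can be checked with simple arithmetic. The sketch below is only an approximation: it assumes the 10 individual prizes and the 10 team-prize shares (five teams of two) go to 20 different people, which the instructions do not guarantee. The variable names are illustrative.

```python
# Prize structure as stated in the instructions read to the Ss.
participants = 128
individual_prizes = 10        # ten highest individual scorers
team_prize_shares = 5 * 2     # five winning teams, two players each

# Assumption: no S wins both kinds of prize, so the shares do not overlap.
prize_slots = individual_prizes + team_prize_shares
chance = prize_slots / participants

print(f"{prize_slots} prize slots among {participants} Ss: {chance:.3f}")
print(f"roughly one chance in {1 / chance:.1f}")
```

Under that assumption the chance is slightly below one in six, which agrees with the rounded figure given to the Ss.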


Each S was asked to make 3 responses to each 
matrix. For each S, the number of team-oriented 
responses, or the number of responses that would 
maximize the total number of points awarded to 
S and his partner, was determined for each matrix 
and was analyzed as a function of aptitude, per- 
formance, and sex. 
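
The distinction between a team-oriented and an individual-oriented response can be sketched with the sample matrix shown to Player A. Two things here are assumptions rather than the study's own scoring rule: the second-row entries (5,3 and 4,3) are a reading of a damaged scan, and a choice is ranked by summing its payoffs over the partner's two possible moves. The function names are hypothetical.

```python
# Sample matrix: each cell is (Player A's points, Player B's points).
# Assumption: the A2 row is read as 5,3 and 4,3.
SAMPLE = {
    ("A1", "B1"): (5, 6), ("A1", "B2"): (3, 6),
    ("A2", "B1"): (5, 3), ("A2", "B2"): (4, 3),
}

def row_totals(matrix, score):
    """Sum score(cell) over the partner's options for each of A's choices."""
    totals = {}
    for (a_choice, _b_choice), cell in matrix.items():
        totals[a_choice] = totals.get(a_choice, 0) + score(cell)
    return totals

def best_choice(matrix, score):
    """Return A's choice with the largest summed score."""
    totals = row_totals(matrix, score)
    return max(totals, key=totals.get)

# Team orientation: maximize the points of both partners together.
team_choice = best_choice(SAMPLE, lambda cell: cell[0] + cell[1])
# Individual orientation: maximize A's own points only.
own_choice = best_choice(SAMPLE, lambda cell: cell[0])

print("team-oriented choice:", team_choice)
print("individual-oriented choice:", own_choice)
```

With these values the two orientations favor different rows, which is the kind of conflict the test matrices were built to create.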

RESULTS 


Four Ss who were selected for the study 
were unable to participate; three more 
Ss did not understand the instructions and 
recorded their answers incorrectly. To ob- 
tain proportional cell frequencies neces- 
sary for analysis of variance, seven more 
Ss were eliminated at random from various 
cells. The final sample consisted of 15 Ss 
of high ability and 13 Ss of low ability at 
each combination of sex and performance. 

Some indication of the effectiveness of 
the experimental procedure in producing 
team-oriented and individual-oriented be- 
havior could be obtained by comparing 
the frequency of team-oriented responses 
to a particular matrix with the magnitude 
of the effect of these responses on team 
outcomes. If Ss understood the experimen- 
tal task and the consequences of their 
choices, they should generally make more 
team-oriented responses when the magni- 
tude of team outcomes was relatively more 
affected by their choices. This appeared 
to be the case. The correlation between 
the mean number of team-oriented re- 
sponses to each matrix and DTI was .77 
(N = 8, p < .025). The correlation calcu- 
lated for each S and then averaged over 
Ss was lower (M = .21, N = 112), but 
also was in the direction expected. 

An extension of a Lindquist Type II 
analysis of variance was performed on the 
number of team-oriented responses as a 
function of sex, aptitude, performance 
and type of matrix. (Data relevant to the 
analysis are shown in Table 2.) This analy- 
sis yielded significant main effects of per- 
formance (F = 5.07; df = 1/104; p < 
.05; MSe = 2.81) and matrix (F = 10.08; 
df = 7/728; p < .01; MSe = .445) and 
a significant interaction of sex, perform- 
ance, and matrix (F = 2.10; df = 7/728; 
p < .05). 

The main effect of matrix type indicated 
that team orientation increased with DTI, 
as also noted in correlation analyses. Low 
performers were less team-oriented (M = 
1.74) than high performers (M = 2.15). 
To explore the contingencies indicated by 
the significant interaction, supplementary 
analyses were performed involving (a) 
only the four matrices for which DTI was 


TABLE 2 
MEAN NUMBER OF TEAM-ORIENTED CHOICES AS A FUNCTION OF APTITUDE (APT), PERFORMANCE 
(PER), SEX, AND TYPE OF MATRIX 

                                      Number of matrix 
Group               2  |  3   |  4   |  6   |  1   |  5   |  7   |  8   |  M 
            DTI =  -1  | -.5  |  0   |  1   |  2   | 2.5  |  3   |  6   | 

Low Apt, low Per 
  Males          1.46 | 1.53 | 1.00 | 1.54 | 1.77 | 2.15 | 2.23 | 2.07 | 1.2 
  Females        1.46 | 1.77 | 1.77 | 2.00 | 1.38 | 2.08 | 1.62 | 2.31 | 1.8 
  M              1.46 | 1.65 | 1.38 | 1.77 | 1.58 | 2.16 | 1.92 | 2.19 | 1.7 
High Apt, low Per 
  Males          1.33 | 1.67 | 1.13 | 1.80 | 1.40 | 2.00 | 1.67 | 1.60 | 1.58 
  Females        1.60 | 1.93 | 1.60 | 1.93 | 1.60 | 2.00 | 2.13 | 2.13 | 1.8 
  M              1.47 | 1.80 | 1.37 | 1.87 | 1.50 | 2.00 | 1.90 | 1.87 | 1.7 
Low Apt, high Per 
  Males          1.85 | 1.85 | 1.54 | 1.61 | 1.69 | 1.54 | 1.60 | 1.69 | 1.8 
  Females        1.77 | 1.85 | 1.69 | 2.08 | 1.92 | 2.08 | 2.23 | 2.15 | 1.07 
  M              1.81 | 1.85 | 1.62 | 1.81 | 1.81 | 1.81 | 1.96 | 1.92 | 18 
High Apt, high Per 
  Males          1.73 | 2.00 | 1.87 | 2.20 | 2.00 | 2.00 | 2.27 | 2.93 | 2.0 
  Females        1.80 | 2.20 | 1.80 | 2.97 | 2.40 | 2.27 | 2.60 | 2.40 | 2 
  M              1.77 | 2.10 | 1.83 | 1.77 | 2.20 | 2.13 | 2.43 | 2.37 | 28 

Note.—Matrices are listed in the order of their increasing effect of choice on the team outcomes. 
For low Apt groups, N = 13 males, 13 females; for high Apt groups, N = 15 males, 15 females. 


TABLE 3 


ANALYSIS OF VARIANCE SUMMARIES OF DATA INVOLVING LOW-DTI MATRICES AND DATA 
INVOLVING HIGH-DTI MATRICES 

                    Low-DTI matrices (2, 3, 4, 6)    High-DTI matrices (1, 5, 7, 8) 
Source                df      MS       F               df      MS       F 

Sex (A)                1     4.93     3.14              1     4.72     2.66 
Aptitude (B)           1     1.91     1.22              1     2.08     1.18 
Performance (C)        1     9.43     6.02*             1     5.14     2.90 
A × B                  1      .16                       1      .79 
A × C                  1     1.39                       1     2.28     1.29 
B × C                  1      .60                       1     8.23     4.64* 
A × B × C              1      .00                       1     3.28     1.85 
Error (b)            104     1.57                     104     1.77 
Matrices (D)           3     3.75     8.18**            3     2.28     5.68** 
A × D                  3      .84                       3     219 
B × D                  3      .37                       3      .14 
C × D                  3      .10                       3     1.55     3.85* 
A × B × D              3      .54     1.17              3      .54     1.34 
A × C × D              3      .40                       3      .58     1.48 
B × C × D              3      .16                       3      .16 
A × B × C × D          3      .03                       3      .82 
Error (w)            312      .458                    312      .402 

Note.—F-ratios < 1.0 are not shown. 

* p < .05. 

** p < .01. 


lowest (Matrices 2, 3, 4, and 6) and (b) 
only the four matrices for which DTI was 
greatest (Matrices 1, 5, 7, and 8). These 
analyses are summarized in Table 3. 

Analyses of responses to low-DTI mat- 
rices yielded significant effects of matrix 
type and performance but no significant 
interactions. Data relevant to these analy- 
ses, shown in Table 4a, indicate that high 
performers were significantly more team- 
oriented (M = 1.89) than were lower 
performers. This relationship was equally 
evident at both levels of aptitude. 

Analyses of responses to high-DTI mat- 
rices yielded a significant main effect of 
matrix type. However, unlike the analyses 
of low-DTI data, performance interacted 
significantly both with aptitude and with 
type of matrix. Data relevant to these 
interactions are shown in Table 4b. 


TABLE 4 
MEAN NUMBER OF TEAM-ORIENTED RESPONSES ON 
MATRICES LOW IN DTI AND MATRICES HIGH IN 
DTI AS A FUNCTION OF APTITUDE, 
PERFORMANCE, AND SEX 

ah aptitude 1s | 108 | 18 Lit | 2.02 | 188 
if” sonia 1.98 | tba | 1.04 | 1.76 | 1.04 | 1.85 
High paeude 4.07 | 2.15 | 1.91 | 1.97 | 2642 | 2-00 
jp" NN 3.08 | 105 | 8 | Eon | a7 | 2:00 


The significant interaction of aptitude 
and performance indicates that a positive 
relationship between performance and the 
number of team-oriented responses oc- 
curred only among students of high apti- 
tude (F = 7.38, p < .01). In fact, high 
performers of high ability were more team 
oriented (M = 2.28) than were Ss at 
any other level of aptitude and perform- 
ance, and differed from Ss in the other 
three cells combined (M = 1.87; F = 
8.15; p < .01). 

The significant interaction of perform- 
ance and matrix type appears due to the 
fact that performance was related posi- 
tively to the number of team-oriented re- 
sponses to all matrices except Matrix 5, 
where the relationship was nonsignificantly 
negative. This matrix is similar to the 
well-known “prisoner’s dilemma” in which, 
if both players try to maximize personal 
gain, they decrease both individual and 
group payoffs. Such a matrix may intro- 
duce additional factors into the decision 
making that distinguish it from the 
others used. 
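
The structure of a prisoner's dilemma can be illustrated with a small sketch. The payoffs below are hypothetical textbook values, not those of Matrix 5, and the labels C (team-oriented choice) and D (self-maximizing choice) are introduced here only for illustration.

```python
# Hypothetical prisoner's-dilemma payoffs: (row player's points, column player's points).
PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def best_reply(partner_choice):
    """Row player's payoff-maximizing choice against a fixed partner choice."""
    return max("CD", key=lambda mine: PAYOFFS[(mine, partner_choice)][0])

# D is the best reply whatever the partner does, so a self-maximizing player picks D.
replies = {partner: best_reply(partner) for partner in "CD"}

both_selfish = PAYOFFS[("D", "D")]
both_team = PAYOFFS[("C", "C")]

print("best replies:", replies)
print("both maximize personal gain:", both_selfish, "joint =", sum(both_selfish))
print("both choose for the team:   ", both_team, "joint =", sum(both_team))
```

When both players follow the individually best reply, each ends with fewer points than under mutual team-oriented choice, and the joint total is lower as well.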

It was speculated that differences be- 
tween males and females in the relation- 
ship between performance and the fre- 
quency of team-oriented responses might 
occur as a result of sex differences in the 
relationship of academic achievement to 
social goal attainment. This hypothesis 
was not supported on the basis of the 
above data. When choices had little effect 
upon team outcomes, the frequency of 
team-oriented responses increased with 
performance among all Ss. When the 
effect of choices upon team outcomes was 
greater, the number of team-oriented re- 
sponses increased with performance level 
among high-aptitude students of both 
sexes but was not substantially related to 
performance among either males or fe- 
males of low measured ability. Among 
males, the relationship was nonsignificantly 
negative. 


Discussion 


When choices had a relatively great 
effect upon team outcomes, the tendency 
to make team-oriented choices was related 
positively to performance among high- 
aptitude students of both sexes, but was 
unrelated to performance among low- 
aptitude students. 

The results of this study therefore call 
into question the general validity of the 
assumption that the likelihood of academic 
success is greater among students who are 
not socially motivated and who therefore 
are more apt to pursue academic goals 
without being distracted by competing 
social interests. While high academic per- 
formance may often require concentrated 
independent effort, the tendency to co- 
operate with others in mutual pursuit of 
achievement goals may generalize to the 
academic environment and may result in 
an increase in academic effectiveness. To 
this extent, social orientation may actually 
facilitate achievement in college. 

The only qualification to the interpre- 
tation offered above concerns overachieving 
males (high performers of low meas- 
ured ability). These Ss were less team 
oriented than low performers of low ability 
when the effect of their choices on team 
outcomes was high. Furthermore, supple- 
mentary analyses of individual correlations 
between team orientation and DTI indicated 
that male overachievers, unlike Ss in any 
other academic category considered in this 
study, actually decreased their team orien- 
tation slightly when team outcomes were 
more affected. It may be speculated that 
overachievement among males results in 
part from a general desire to be recognized 
as an individual for success in goal-directed 
activity. Overachievement among females, 
however, would not have similar roots. 

Students who perform poorly despite 
high ability might be expected to show 
little interest in being recognized for per- 
sonal achievement. They nevertheless ap- 
pear to prefer to seek goals independently 
of other persons. In this regard, Wyer 
(1967) reported that when an incentive to 
perform well on a judgmental task was 
provided, underachievers of both sexes 
conformed less to group estimates than did 
students at any other level of aptitude and 
performance. This finding supports the 
view that underachieving students typi- 
cally prefer not to rely upon others in 
achievement-related activity. 

Other interpretations of the results of 
this study are plausible. In this regard, 
although high-ability, high-performing stu- 
dents of both sexes were relatively team- 
oriented, a possible distinction between 
males and females may be worth consider- 
ing. Wyer (1967) found that when the 
incentive to perform well on a judgmen- 
tal task was minimized and social 
attractiveness increased, high-aptitude, 
high-performing males conformed more to 
fictitious group norms than did males at 
other levels of Apt and Per, but high- 
aptitude, high-performing females con- 
formed less than did females not fitting 
this description. Conformity under such 
conditions may be due primarily to con- 
cern over being accepted by other group 
members (Deutsch & Gerard, 1955; Wyer, 
1966). Therefore, while the team orienta- 
tion of high-performing males of high 
ability may be due in part to their con- 
cern over being liked or accepted by their 
partners, similar behavior among high- 
performing females of high ability may 
have different determinants. Females may 
have generally less interest in personal 
achievement than males. Those who per- 
form well academically may be satisfied 
with the recognition they have received for 
individual achievement, and therefore may 
tend less to seek recognition in nonaca- 
demic situations. 

Some of the fundamental questions con- 
cerning the motivational and behavioral 
correlates of academic achievement are 
still unanswered by the present study. For 
example, the behavioral characteristics of 
students in achievement-related activity 
that does not involve other persons are yet 
to be delineated. The validity of assump- 
tions that underachievers have less interest 
in recognition for personal achievement 
than other persons, or that overachievers 
are more apt to persevere in achievement- 
related activity, has not been tested di- 
rectly. More specific questions were also 
raised by this study which should be con- 
sidered in further research in this area. 
Specifically, differences between matrices 
in the relative effect of choices upon team 
and individual goals were due primarily 
to differences in the effect of choices 
upon team outcomes. Since relationships 
involving performance were contingent 
upon the type of matrix involved, the use 
of matrices in which choices affect in- 
dividual outcomes to a relatively greater 
extent than team outcomes should be con- 
sidered. It may also be fruitful to explore 
choice behavior under conditions in which 
the likelihood that partners would be ex- 
pected to make team-oriented choices is 
systematically varied. Finally, all groups 
used in this study were homogeneous with 
respect to sex; situations in which partners 
are of the opposite sex might produce sub- 
stantially different results from those re- 
ported here. 


COGNITIVE CONFLICT AND REVERSIBILITY TRAINING IN 
THE ACQUISITION OF LENGTH CONSERVATION 


FRANK B. MURRAY 
University of Minnesota 


A child conserved the length of 2 equal sticks that were made to look 
unequal by the Müller-Lyer illusion if he saw the illusion and, de- 
spite it, maintained that the sticks were the same length. Nonconserv- 
ers who were trained by a reversibility and cognitive conflict 
procedure did significantly better than untrained nonconservers on 
the Müller-Lyer task and on a transfer task with the same sticks dis- 
torted by the Oppel inverted T illusion (p < .01). There was no 
significant difference between the trained and untrained nonconserv- 
ers on a transfer task in which equal areas were distorted by the 
Jastrow illusion. 

Whatever the origins of our conception 
of length we are deeply convinced, as 
Einstein (1957) observed, that the length 
of a thing is constant even if its position 
is changed. The notion of the conservation 
of length refers to just this deep convic- 
tion, namely the recognition that the 
length of a thing is variant only under the 
transformations or operations of addition 
and subtraction, or in some cases from ex- 
pansion or contraction due to temperature 
change or the sheer mechanical forces of 
stretching and compression. Length is in- 
variant or conserved under such irrelevant 
transformations as position change, move- 
ment, filled space, perceptual distortion, 
etc. 


Piaget, Inhelder, and Szeminska (1960) 
found, in studying the conservation of 
length, that when two identical sticks were 
evenly placed one under the other, children 
correctly judged that the sticks were the 
same length. However, when one of the 
sticks was moved slightly to the right of 
the other, children under the age of 8 gen- 
erally thought the moved stick was longer 
than the unmoved one, that is, they failed 
to conserve length in this instance. Chil- 
dren under the age of 8 also judged a 
straight line and sinuous one to be equal in 
length when the lines’ end points were 
even. These children, moreover, persisted 
in claiming that the lines were equal after 
the sinuous line had been stretched to show 
its greater length and returned to the origi- 
nal position with its end points flush with 
those of the straight line. Children also 
judged two unequal lines, straight and 
curved, whose end points were all even, to 
be the same length. That length was not 
conserved by children under 8 in these in- 
stances was found in a replication by Lov- 
ell, Healey, and Rowland (1962). 

It might have been the case that in the 
above studies the children and the experi- 
menter meant different things by length, 
the former denoting by it only the position 
of the lines’ end points, a strategy that 
would, more often than not, produce suc- 
cessful encounters with the different lengths 
of things in the child’s world. A previous 
study (Murray, 1965) has shown that if the 
equality of two lengths whose end points are 
even is distorted by some common geomet- 
rical illusions, nonconservation is still found 
in children under 8, and therefore must re- 
sult from more than just the misconception 
of length based only on the relative position 
of the extremities of lines. 

The phenomenon of nonconservation of 
length itself is surprising in children near 
8 years old, particularly since average chil- 
dren between 3 and 3½ years of age can 
successfully (82% of the time) follow 
verbal instructions to pick out the longer 
of two sticks on the Binet test. More sur- 
prising than the phenomenon of nonconser- 
vation itself, however, is the finding by 
conservation learning or training studies 
that even the most sound and reasonable 
training procedures “have had remarkably 
little success in producing cognitive change 
[Flavell, 1963, p. 378].” The evidence from 
training studies of the conservation of 
number (Wohlwill & Lowe, 1962), con- 
servation of substance and weight (Smeds- 
lund, 1961a, 1961b, 1961c), conservation 
of area (Beilin & Franklin, 1962) and 
conservation of length (Smedslund 1963a, 
1963b) indicates that a child cannot be 
taught the concept in question unless he 
has developed the particular cognitive 
structure in question and this structure 
does not seem significantly altered by ex- 
ternal reinforcement or manipulation. 

Smedslund (1961d) proposed that a state 
of cognitive conflict was the precursor of the 
cognitive reorganization that was required 
to support conservation. The proposal is 
consonant with the Piagetian notion that 
problems provoke cognitive disequilibrium, 
the resolution of which requires a new inte- 
gration of distinct operations, such as simul- 
taneously, instead of separately, attending 
to height and width in substance conserva- 
tion or both end points in length conserva- 
tion. Smedslund (1961d) created the conflict 
by changing the shape of a clay ball to make 
it appear larger when a piece had been 
taken away to make it actually lighter. 
Smedslund (1963b) has had mild success 
with this method of the juxtaposition of 
competing forces in training the conserva- 
tion of length. Gruen (1965) has used the 
technique in training number conservation 
and found it was successful only when it 
was coupled with verbal pretraining. The 
cognitive conflict technique was used in the 
present study to the extent that subjects 
(Ss) were directed to perform actions that 
made the same stick appear longer and 
then shorter than an equal length 
stick. 

The technique was based primarily on 
the notion that the cognitive operations 
that support conservation are reversible, 
for example, the operation that made the 
equal length sticks unequal can be undone 
by the inverse operation. To conserve any 
property the child presumably needs the 
rule that allows him to get from the origi- 
nal state to the transformed state and back 
again. The notion of reversibility is for 
Piaget the defining property of the cogni- 
tive operations that support the conserva- 
tions. Wallach and Sprott (1964) have 
been successful in training number conser- 
vation with a reversibility procedure, but 
Beilin (1965) was unsuccessful in training 
number and length conservation with it. 
The reversibility procedure used in this 
study produced a conflict between the orig- 
inal and transformed states that could be 
resolved by allowing S to perform the ac- 
tions that connected the two states. 


METHOD 


Subjects 


All 119 Ss, 69 boys and 60 girls, were enrolled 
in the kindergarten, first, and second grades of the 
Lida Lee Tall Laboratory School in Towson, Mary- 
land. The mean age for each grade was 5.76 years 
(SD = .30, range 5.25-6.16), 6.71 years (SD = .33, 
range 6.25-7.16), and 7.79 years (SD = .34, range 
7.25-8.16) respectively. The mean IQ from the 
SRA Primary Mental Abilities Test was 112.4 
(SD = 11.4) for the first grade and 112.5 (SD = 
10.9) for the second grade. Test results for kinder- 
garten were not available. There were in all 31 
kindergartners, 42 first, and 46 second graders. 

All testing was done individually in a small 
conference room in the laboratory school. Each S 
faced the experimenter across a table, which held 
the stimulus materials used in the study. 


Materials 


The materials for the length conservation test 
were: (a) 10 round white sticks, 5 millimeters in 
diameter, two of which were the same length (21 
centimeters), four were unequally longer than 21 
centimeters and the rest, shorter, (b) a black rec- 
tangular (30 X 50 centimeters) composition board 
on which were glued three cardboard figures—the 
arrows and feathers of the Müller-Lyer figure in 
horizontal position (see Figure 1). The two seg- 
ments of the arrows and feathers were 5 milli- 
meters X 7 centimeters and at right angles to one 
another. 


Procedure 

The experimenter introduced all tasks as a 
game, and explained his notetaking as keeping 
score so he wouldn’t forget how well Ss did. Be- 
fore the test of length conservation, Ss were pre- 
sented a long and a short stick from the 10 sticks 
described above, and asked to find the longer one. 
The Ss were asked then if they could find two 
sticks in the 10 that were the same length or size. 
If S had difficulty with the multiple discrimina- 
tions, the experimenter prompted by pointing to 
two unequal sticks and asking about their length, 
and then pointing to the two equal sticks and ask- 
ing about their length. The children’s responses to 
these preliminary questions ensured that Ss under- 
stood the directions in the way that the experi- 
menter intended. 

Fig. 1. The Müller-Lyer illusion (A), Oppel in- 
verted T illusion (B), and the Jastrow illusion 
figures (C). 


The experimenter took the equal sticks that S 
had selected and placed them upright side by 
side on the table to enable S to reconfirm their 
equality, before the experimenter slowly placed 
them in the Müller-Lyer configuration to make 
their lengths appear unequal. One-half of the Ss in each 
grade had the “feathered figure” on the left, and 
the rest had it on the right. The Ss were asked: 


Does this stick (experimenter pointed to the 
stick in the “feathered” configuration) look 
longer than this stick (experimenter pointed to 
the “arrowed” stick), does it look the same 
length as this stick, or does it look shorter than 
this stick? 
To nullify any leading influence the order of the 
alternatives (longer, shorter, the same length as) 
might have on S's response, the order of the alter- 
natives was varied randomly among Ss. If S an- 
swered that the sticks looked the same length, 
that is, he failed to see the illusion, he was dis- 
carded from the experiment. The Ss were asked a 
second question: 


If the sticks were standing up the way they 
were before, would this stick (experimenter 
pointed to the stick in the feathered configura- 
tion) be longer than this stick (experimenter 
pointed to the other stick), would it be the same 
length as this stick, or would it be shorter than 
this stick? 
Again, the order of the alternatives was varied 
randomly among Ss. 

One week from the session in which the above 
procedures and others described in Murray (1967a, 
1967b) were administered, Ss who were scored as 
nonconservers on the Müller-Lyer problem were 
again given that length conservation test. All Ss 
who could still be scored as nonconservers were 
randomly divided within each grade into a control 
group and an experimental group that would re- 
ceive training in length conservation. 

The control group Ss met individually with the 
experimenter and were given the following tests: 
(a) the length conservation test on the Müller-Lyer 
figure described above, (b) length conservation 
test in which the sticks were held by the experi- 
menter in the form of the Oppel inverted T illu- 
sion, and (c) a size or area conservation problem 
in which two equal ring segments were held by 
the experimenter in the pattern of the Jastrow 
illusion (Figure 1). 

In the inverted T problem, Ss were asked (a) 
if the vertical stick looked longer, etc., than the 
horizontal one (experimenter pointed to both 
sticks), and (b) if it was really longer, etc. In the 
Jastrow figure, Ss were asked similar questions 
about the apparent and real size differences be- 
tween the bottom and top ring segments. 

Before the experimental group was subjected to 
the three conservation tests given to the control 
group, they were trained in the following manner: 
(a) the sticks in the Müller-Lyer figure were 
placed by the experimenter before Ss as before, 
but S was allowed to pick them up to confirm their 
equality and himself replace them in the distorting 
figure, (b) S was permitted to pick up and replace 
the sticks two more times, (c) S was directed to 
try switching the sticks, that is, replacing the one 
from the feathered configuration in the place of 
the one from the arrowed configuration and vice 
versa, and (d) S was permitted to switch the sticks 
two more times. 


RESULTS 


All Ss were able to answer the prelimi- 
nary questions about unequal and equal 
length correctly. Each S was scored as hav- 
ing conserved length only if (a) he saw the 
Müller-Lyer illusion, that is, said the 
feathered stick looked longer, and (b) 
despite seeing the illusion maintained that 
the sticks were the same length. If S main- 
tained that the sticks were unequal in 
length, he was scored, of course, as a non- 
conserver. 

A median test (Siegel, 1956) was ap- 
plied to the scores, and indicated a sig- 
nificant difference in conservation for the 
group above the median age (6.91 years) 
and for the group below it (χ² = 14.4, p < 
.001). 

Chi square for differences in conserva- 
tion between males and females was not 
significant for the group (χ² = 1.98, p < 
.20). Thirty-seven of the 38 Ss identified 
as nonconservers on the length conserva- 
tion task were retested on the same task. 
Twenty-nine were still found to be non- 
conservers and eight were found to be con- 
servers. The 29 Ss randomly divided by 
grade constituted the control and training 
groups. 


TABLE 1 
NUMBER OF CONSERVERS AND NONCONSERVERS 
ABOVE AND BELOW THE GROUP MEDIAN 
OF 6.91 YEARS 

Group      Conservers    Nonconservers 
Above          51 
Below          30 

Note.—Range 5.25-8.16 years. 


TABLE 2 
NUMBER OF CONSERVERS (C) AND NONCONSERV- 
ERS (NC) IN LENGTH CONSERVATION AND 
TRANSFER OF LENGTH CONSERVATION 
TASKS BETWEEN TRAINING AND 
CONTROL GROUPS 

                  Müller-Lyer          Transfer tasks 
                     task*         Oppel T*      Jastrow** 
Groups             C     NC        C     NC      C     NC 
Experimental      14      1       11      4      7      8 
Control            0     14        1     13      0     14 

Note.—For the experimental group, N = 15; 
for the control group, N = 14. 

* p < .01. 

** p > .10. 


On the Müller-Lyer length conservation 
task (Table 2) there were by the Fisher 
Test (Siegel, 1956) a significantly greater 
number of conservers from the training 
group than the control group (p < .01). 
On the test of length conservation transfer 
to the Oppel inverted T task, there were 
also by the Fisher Test a significantly 
greater number of conservers in the train- 
ing group than the control group (p < .01), 
but on the Jastrow size or area conserva- 
tion test the difference in conservers be- 
tween the training and control groups was 
insignificant (p > .10). The significance 
levels are two-tailed and conservative since 
a greater number of conservers was pre- 
dicted in the group that had training. 
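
The Fisher test on the Müller-Lyer counts can be reproduced with a short sketch. It assumes the Table 2 cells read 14 conservers and 1 nonconserver in the experimental group and 0 and 14 in the control group, and it uses one common two-tailed convention (summing every table no more probable than the observed one); the function name is hypothetical and the original computation may have differed in detail.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher exact p for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))

# Assumed reading of Table 2, Müller-Lyer task: rows = groups, columns = (C, NC).
p_muller_lyer = fisher_exact_two_sided(14, 1, 0, 14)
print(f"two-tailed p = {p_muller_lyer:.2e}")
```

Under these assumptions the probability falls far below .01, in line with the significance level reported for the Müller-Lyer task.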


Discussion 


The transition from nonconservation to 
conservation was found to be between 6 
and 7 years of age. Comparisons of this 
finding with age norm findings in other 
studies must consider at least the follow- 
ing variables which have been found to be 
related to conservation: IQ of Ss, task com- 
plexity and materials, field-independence 
of Ss, and socioeconomic status, culture, 


and urban-rural location. In a previous 
study (Murray, 1965) the transition from 
nonconservation to conservation occurred 
between 7 and 8 years. The most parsimon- 
ious explanation for the higher transition 
age found in that study is that older chil- 
dren were Ss. Arguments could be ad- 
vanced, however, that the length conserva- 
tion task used in the present study was 
simpler and more concrete than the tasks 
used in the previous study. 

Piaget, Inhelder, and Szeminska (1960) 
found that length conservation was com- 
plete for 50% of those children aged 7-7½ 
years, and Vinh-Bang’s data (Smedslund, 
1963a) sets the 50% conservation level 
closer to 8 years of age. In the present 
study, 52% of the 5- and 6-year olds 
(median age = 6.16), 72% of the 6- and 7- 
year olds (median age = 6.91) and 86% of 
the 7- and 8-year olds (median age = 
7.66 years) conserved length. It can be 
said that Piaget’s findings on the relation- 
ship between conservation and age were 
broadly confirmed. 

Although precautions were taken in the 
present study to ensure that Ss knew the 
sticks were the same length, it might be 
argued that this knowledge was forgotten. 
Lovell and Ogilvie (1960) point out, how- 
ever, that forgetting is not a critical factor 
in nonconservation, since many conservers 
could remember the original situation. 

It might be argued also that conserva- 
tion or nonconservation was found in the 
present study simply because the second 
question, namely, “How is it really?” sug- 
gested to Ss that a different answer was 
being looked for. McConnell (1963) has 
found, for example, that when children 
judge the relative sizes of two equal geo- 
metrical forms, they judge the forms to be 
unequal simply because the opportunity 
suggested itself in the question. McConnell 
found the suggestibility factor to be nega- 
tively related to age (6-18 years), al- 
though the curvilinear relationship showed 
no appreciable differences between 6-, 7-, 
and 8-year olds. If such a factor were op- 
erating in the present study, it would tend 
to lower the nonconservation-conservation 
transition age, by lowering the number of 


nonconservers, namely, Ss who gave the 
same answer to both questions. If it were 
the sole factor operating on the responses 
to the second question, one-half of the Ss 
at each age would be scored conservers; 
a binomial test failed to support the hy- 
pothesis. 

It is possible that by answering the sec- 
ond question indiscriminately, one-third of 
the Ss could have conserved the item by 
chance alone. A binomial test (p = 1/3) 
on the proportions of nonconservers and 
conservers at each age indicated that only 
the youngest Ss (below age 6.16 years) 
could have conserved length in this prob- 
lem by guessing alone. 
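
The guessing argument above can be sketched as a one-tailed binomial calculation. The group size and conserver count below are hypothetical, since the per-age counts are not given in this passage; only the chance rate of one-third comes from the text.

```python
from math import comb


def binomial_upper_tail(successes: int, n: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical age group: 25 children, 18 of them scored as conservers.
n_children = 25
n_conservers = 18
p_guess = 1 / 3  # chance rate stated in the text

tail = binomial_upper_tail(n_conservers, n_children, p_guess)
print(f"P(at least {n_conservers} of {n_children} by guessing) = {tail:.6f}")
```

A small tail probability means that guessing alone is an unlikely account of the observed number of conservers in that group.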

As a result of reversibility and cognitive 
conflict training, nonconservers acquired 
conservation in the Müller-Lyer task they 
were trained on, and in the Oppel in- 
verted T task on which they were not 
trained. Training did not seem to be ef- 
fective in producing transfer to the area or 
size conservation task. The lack of transfer 
in trained conservation was found also by 
Gruen (1965) in which training in number 
conservation did not transfer to length and 
substance conservation, and by Beilin 
(1965) who found that training in length 
and number conservation did not transfer 
to area conservation. It would seem that 
the conservations are acquired for each 
concept separately, although it could be, as 
Smedslund (1961b) has shown, that con- 
servation that is laboratory acquired may 
not have as much depth or generality as 
conservation that is “naturally” acquired. 

Beilin and Franklin (1962) and Beilin 
(1965) have found that training in con- 
servation was more effective with older 
nonconservers than the younger ones. The 
success of the present training procedure 
may have resulted in part from the use of 
older Ss, though the median ages of the Ss 
in the present study and in Smedslund’s 
study (1963b) were virtually the same. 
Nevertheless, the age of nonconservers 
(median age = 6.16 years) in the present 
study was close to the age that length con- 
servation would begin to occur “naturally.” 

Other reasons for the effectiveness of the 
training procedure can be speculated on. 




Since forgetting the equality is not a sig- 
nificant factor in nonconservation, the 
child may lack an awareness of the pos- 
sibility of return to the original situation. 
The child considers the two states to be 
static and separate, because the rules or ac- 
tion relating them are absent. Inhelder 
(1965) cites evidence to show that young 
preoperational children were unaware, for 
example, of the successive forms an arc 
would take in being straightened, while 
those that conserved could represent the 
intermediate stages of the transformation 
from arc to straight line. It could be argued 
that the present training procedure, by al- 
lowing S to actively manipulate the sticks 
from one state to the other, facilitated 
S’s awareness of the relationship between 
the equal and unequal appearing states. 

The phenomena of nonconservation and 
conservation themselves are important 
facts in the psychology of subject matter. 
The fact that instruction may only be ef- 
fective when children are close to the time 
at which length is conserved naturally has 
obvious implications for the educational 
notions of readiness and curriculum se- 
quence. The conserved concept of length, 
important in itself, is a requisite for the 
concepts of transitivity and measurement. 
It is clear from the present study that 
nonconservers can be taught to conserve; 
nevertheless, premature instruction in the 
concept may result in no more than the 
acquisition of verbal fluency which is a 
conceptual defect that has the magnitude 
of nonconservation itself. Additional in- 
structional techniques, not unlike the ones 
used in the present study, certainly can be 
devised to influence cognitive development 
and make its sources more explicit. 


REFERENCES 


Beilin, H. Learning and operational convergence 
in logical thought development. Journal of Ex- 
perimental Child Development, 1965. 

Beilin, H., & Franklin, I. Logical operations in 
area and length measurement: Age and training 
effects. Child Development, 1962, 33, 607-618. 

Einstein, A. Relativity, the special and the gen- 
eral theory. (15th ed.) London: Methuen. 

Flavell, J. The developmental psychology of Jean 
Piaget. New York: Van Nostrand, 1963. 




Gruen, G. Experiences affecting the development 
of number conservation in children. Child De- 
velopment, 1965, 36, 963-980. 

Inhelder, B. Operational thought and symbolic 
imagery. Monographs of the Society for Re- 
search in Child Development, 1965, 30, 4-18. 

Lovell, K., Healey, D., & Rowland, A. Growth of 
some geometrical concepts. Child Development, 
1962, 33, 751-767. 

Lovell, K., & Ogilvie, E. A study of the conserva- 
tion of substance in the junior school child. 
British Journal of Educational Psychology, 1960, 
30, 109-118. 

McConnell, T. Suggestibility in children as a 
function of chronological age. Journal of Abnor- 
mal and Social Psychology, 1963, 67, 286-289. 

Murray, F. B. Conservation of illusion-distorted 
lengths and areas by primary school children. 
Journal of Educational Psychology, 1965, 56, 
62-66. 

Murray, F. B. Conservation of illusion-distorted 
length and illusion strength. Psychonomic Sci- 
ence, 1967, 7, 65-66. (a) 

Murray, F. B. Some factors related to the conser- 
vation of illusion-distorted length by primary 
school children. 1967 AERA Proceedings, 1967, 
193-194. (b) 

Piaget, J., Inhelder, B., & Szeminska, A. The 
child’s conception of geometry. New York: Basic 
Books, 1960. 

Siegel, S. Nonparametric statistics for the be- 
havioral sciences. New York: McGraw-Hill, 
1956. 

Smedslund, J. The acquisition of conservation of 
substance and weight in children. I External 
reinforcement of conservation of weight and of 
the operations of addition and subtraction. 
Scandinavian Journal of Psychology, 1961, 2, 
71 ff. (a) 

Smedslund, J. The acquisition of conservation of 
substance and weight in children. II Extinction 
of conservation of weight acquired “normally” 
and by means of empirical controls on a balance. 
Scandinavian Journal of Psychology, 1961, 2, 
85 ff. (b) 

Smedslund, J. The acquisition of conservation of 
substance and weight in children. IV Attempt at 
extinction of the visual components of the 
weight concept. Scandinavian Journal of Psy- 
chology, 1961, 2, 153-155. (c) 

Smedslund, J. The acquisition of conservation of 
substance and weight in children. V Practice in 
conflict situation without external reinforcement. 
Scandinavian Journal of Psychology, 1961, 2, 
156-160. (d) 

Smedslund, J. Development of concrete transitivity 
of length in children. Child Development, 1963, 
34, 389-405. (a) 

Smedslund, J. Patterns of experience and the ac- 
quisition of conservation of length in children. 
Scandinavian Journal of Psychology, 1963, 4, 
257-264. (b) 

Wallach, L., & Sprott, R. Inducing number con- 
servation in children. Child Development, 1964, 
35, 1057-1072. 

Wohlwill, J., & Lowe, R. An experimental analy- 
sis of the development of the conservation of 
number. Child Development, 1962, 33, 153-167. 


(Received March 13, 1967) 


Journal of Educational Psychology 
1968, Vol. 59, No. 2, 88-93 


“OVERPROMPTING” IN PROGRAMMED INSTRUCTION 


RICHARD C. ANDERSON, GERALD W. FAUST, AND MARIANNE C. RODERICK 
University of Illinois 


108 Ss completed either the Standard version or a Heavily-Prompted 
version of a 1,052-frame section of a psychology program. Half of the 
Ss made constructed responses while the remainder were instructed to 
“think” the answers that went into the blanks. As expected, those who 
received the Standard program scored higher on the posttest and took 
longer to complete the program than those who received the Heavily- 
Prompted version. Response mode made no difference. The results 
were interpreted as showing that arrangements of lesson material 
which permit the student to respond correctly without noticing the 
cue undermine performance. 

One of the stocks in trade of pro- 
grammers is prompting, a technique of 
providing information that helps the stu- 
dent to give a correct answer. Depending 
upon the context, a prompt may consist of 
a rule which can be applied to an ex- 
ample, a hint to help in the solution of a 
problem, the first letter of an answer, a 
synonym for an answer, or one of many 
other devices both simple and complex. A 
prompt is technically defined as a stimulus 
that already controls or partially controls 
a response. The instructional problem is to 
arrange a shift in stimulus control from the 
prompt to the discriminative stimulus or 
cue, a second stimulus (or set of stimuli) 
which prior to instruction does not control 
the response. The shift in stimulus control 
has been accomplished when the student 
can make the response when the cue alone 
is present. 

Much of the actual research on prompt- 
ing has entailed paired-associate lists. It 
has been repeatedly demonstrated that 
people learn faster under a prompting 
procedure, in which both the stimulus 
term and response term appear before the 
response is required, than under the an- 
ticipation method, or confirmation method 
as it has been called in these studies (e.g., 
Cook & Spitzer, 1960; Levine, 1965; Si- 
dowski, Kopstein, & Shillestad, 1961). On 
the basis of these experiments, Cook 
(1963) has been willing to extrapolate to 
instructional practice. He has argued that 
the correct answer should be indicated to 
the student before he makes his response. 
The implication is that there is no such 
condition as “overprompting,” a state of 
affairs, warned against in programmed in- 
struction manuals, in which the response is 
said to be too completely determined by 
the prompt. 

... machine scoring answer sheets and completing 
item analyses; to the McGraw-Hill Book Com- ... 
and to James Holland for providing materials to 
teach people how to use the blackout technique. 

We contend that under certain conditions 
stereotyped and repetitious use of prompt 
frames can impair the effectiveness of pro- 
grammed materials. The argument is that 
many students will begin to respond on the 
basis of the prompt alone, when the design 
of the frame permits it. As a result, their 
behavior will not come under the control 
of the cues because they will not have paid 
attention to the cues. The deleterious effects 
of an “overprompted” program would be 
expected to be most pronounced when, for 
instance, the students are bored, tired, or 
eager to finish quickly. 

Anderson and Faust (1967) developed a 
stylized Russian vocabulary program in 
which each frame consisted of a paragraph 
of five sentences with English subjects and 
Russian predicate nominatives. Immedi- 
ately below the paragraph one of the sen- 
tences was repeated with a blank in place 
of the Russian word. In a second otherwise 
identical version of the program the Rus- 
sian word which was to serve as the re- 
sponse was underlined in every frame. Each 
frame in the No-Underline version required 
the student at least to notice the cue in 
order to respond correctly, since the only 
way to locate the Russian word that went 
in the blank was to find the English word 
with which it was associated. The Underline 
version, on the other hand, permitted the 
student to copy Russian words into the 
blanks without ever looking at the English 
words. As predicted, although both versions 
led to error-free performance during the 
program, on the posttest students who re- 
ceived the No-Underline version recalled 
significantly more Russian words practiced 
during the program than did students who 
received the Underline version. 

The chief purpose of the investigation 
described in this paper was to determine 
whether the analysis of overprompting 
which has just been described applies to 
actual lesson materials. At least one study 
suggests that it does. Kress and Gropper 
(1966) presented a 43-frame program on 
“elasticity” and a 44-frame program on 
“direct and inverse relationships” to 
eighth graders over closed-circuit televi- 
sion. Some groups saw the Nonprompted 
versions. Other groups saw the Partially- 
Prompted programs, in which the first let- 
ter of every response word was supplied. 
The remainder received the Fully- 
Prompted versions, in which the responses 
were printed entirely in capitals. The re- 
sults on the posttests showed that the 
more prompting, the lower the achieve- 
ment. 

Of course, in practice, prompts are sel- 
dom used in quite such a heavy-handed 
and routinized fashion as they have been 
in the experiments on prompting that have 
been completed thus far. It remains to be 
seen whether similar results can be ob- 
tained when a variety of strong prompts 
are employed in a relatively unobtrusive 
manner. 

A secondary objective of the present 
study was to investigate the joint effects of 
amount of prompting and response mode. 


Method 
Subjects 


One hundred and eight students, mostly second- 
ary school teachers, enrolled in a summer course 
in educational psychology served as subjects (Ss). 
Participation was a course requirement for 63 Ss 
whereas 45 were volunteers who had the option of 
participating in the experiment in lieu of writing a 
paper. The Ss were randomly assigned to groups 
with the constraint that each contain proportional 
numbers of volunteers and nonvolunteers. A total 
of 12 other Ss were discarded for the following 
reasons: eight because they indicated (on a ques- 
tionnaire completed after the fifth program set) 
previous exposure to the program or considerable 
knowledge of the material contained in the pro- 
gram; two because they failed to complete the ex- 
periment; and two in order to equalize numbers in 
the treatment groups. 


Materials 


Two versions of the first 1,052 frames (25 sets) 
of the Holland and Skinner (1961) program The 
Analysis of Behavior were prepared. The Standard 
version was identical to the original, except for the 
physical arrangement of frames, while the Heavily- 
Prompted version was altered to include one addi- 
tional prompt on about 90% of the frames. The 
remainder of the frames were already heavily 
prompted. Prompts were introduced according to 
the following rules: (a) the response term was al- 
ways underlined in copying frames; (b) the ap- 
propriate article was used before each response 
blank and ambiguity about whether the response 
was plural or singular was removed; (c) in mul- 
tiple-choice frames the first alternative was always 
correct; (d) strong connectives (e.g., therefore) 
were added when these emphasized existing 
prompts; and (e) when nothing else seemed pos- 
sible the first letter or two of the response was pro- 
vided. 

The programs were mimeographed on 16-pound 
white stock and placed in booklets, one booklet 
for each set, with 3% X 8¥%-inch pages, stapled 
along the left margin. Each page contained a cen- 
tered frame above which was the feedback for 
the preceding frame. The exhibits were in a sep- 
arate booklet composed of longer pages, one ex- 
hibit per page. 


Procedure 


The experiment was conducted in a large, air- 
conditioned room containing space for 15-20 
people at long tables. Free coffee was provided. 
Each S scheduled himself to work on the program 
in 1-hour blocks over a 1-month period, determin- 
ing for himself how his time would be distributed. 
A library arrangement was employed wherein S 
checked out a single booklet at a time and returned 
it when completed to a “librarian,” who checked 
for proper identification and then issued the next 
booklet if S wanted to continue. 

The Ss paced themselves; however, they did 
record the time on a cover sheet immediately be- 
fore beginning a set and again on an end sheet 
immediately after completing the set. Since no one 
was allowed to stop in the middle of a set, training 
time was measured rather accurately. 

A questionnaire was administered immediately 
after S completed the last booklet to assess his 
attitude toward programmed instruction and to 
determine the way in which he went about com- 
pleting the program. Finally, a two-part posttest 
was given approximately 24 hours after S com- 
pleted his last booklet. The posttest consisted of 
20 short-answer questions, which were given first, 
and 51 multiple-choice items. The total posttest 
score consisted of the score on the short-answer 
section plus the score on the multiple-choice sec- 
tion corrected for guessing. Prior to the experi- 
ment the test was revised on the basis of data 
from several groups of Ss, both naive and sophis- 
ticated. Special attention was paid to the develop- 
ment of discriminating items for high achievers. 
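
The scoring rule described above can be sketched as follows. The paper does not state which correction for guessing was applied or how many alternatives each multiple-choice item had; the sketch assumes the conventional formula (rights minus wrongs divided by the number of alternatives less one) and a hypothetical four-alternative format. The function names are illustrative only.

```python
def corrected_for_guessing(right: int, wrong: int, n_alternatives: int) -> float:
    """Conventional correction: R - W / (k - 1). Omitted items are not counted."""
    return right - wrong / (n_alternatives - 1)


def total_posttest_score(
    short_answer: int, mc_right: int, mc_wrong: int, n_alternatives: int = 4
) -> float:
    """Short-answer score plus the corrected multiple-choice score."""
    return short_answer + corrected_for_guessing(mc_right, mc_wrong, n_alternatives)


# Hypothetical S: 10 short-answer items correct, 30 right and 15 wrong
# on the 51 multiple-choice items (6 omitted).
print(total_posttest_score(10, 30, 15))
```

Under this assumed formula a student who guesses blindly on every item has an expected corrected score of zero on the multiple-choice section.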


Results 


As can be seen from Table 1, which 
contains analyses of variance, and Table 
2, which contains means and standard 
deviations, the main hypothesis was con- 
firmed. Those who received the Standard 
program learned significantly more than 
those who got the Heavily-Prompted ver- 
sion. 

Considering the fact that a relatively 
large number of Ss participated, the fact 
that a lengthy and, it was imagined, dis- 
criminating achievement test was used, 
and especially the fact that a rather long 
program was employed, it was fully ex- 
pected that for those who received the 
Standard program the experiment would 
show a significant posttest advantage for 
overt responding. The fact that it did not 
(t = 1.07) gives us pause. Two other stud- 
ies (Holland, 1965; Williams, 1963) which 
employed sections of the same program 
have found an advantage for overt re- 
sponding. There were several differences in 
procedure between the earlier experiments 
and the present one that may account for 
the difference in results. First, in the 
Holland experiment and the Williams ex- 
periment Ss read the frames with the 
blanks filled in whereas in this experiment, 
Ss in the covert condition were instructed 
to “think” the answers that went in the 
blanks. Perhaps the latter procedure is 
somewhat more effective than reading 
alone. Second, all of the Ss in the previous 
experiments were required to participate 
in order to fulfill course requirements, 
while about 40% of the Ss in the present 
experiment were volunteers. Among those 
who took the Standard program, volunteers 
who made overt responses achieved a post- 
test mean of 24.2 and volunteers who were 
instructed to make covert responses 
achieved a posttest mean of 24.1. In con- 
trast, among those who took the Standard 
program and were required to participate, 
Ss who made overt responses showed a 
total posttest mean of 26.4 and those who 
made covert responses showed a mean of 
20.1. While the latter comparison was still 
not significant (t = 1.34), it did seem to 
make a difference whether or not S was a 
volunteer. There is no reason to believe 
that there is a value to overtness for its 
own sake. Covert responses are fine as long 
as they occur. The problem is that covert 
responding may drop out after a period of 
time. Apparently volunteers sustained co- 
vert responding to a greater extent than 
nonvolunteers. 

Previous research (Williams, 1963, 1965) 
has indicated that an advantage from re- 
quiring overt, constructed responses will be 
found primarily when the posttest requires the 
S to supply novel, technical terms. In the 
present study there was no difference be- 
tween the overt and covert groups on the 
short-answer section of the posttest, every 
question of which required S to supply a 
technical term (t = .00). Nor was there a 
significant difference between the overt and 


TABLE 1 
Analysis of Variance of Training Time and 
Posttest Performance 

TABLE 2 
Training Time and Posttest Performance: 
Means and SDs 

                           Training time     Posttest 
                           (in minutes)      performance 
Group                      M       SD        M       SD 

Standard overt             470     114       25.3    13.5 
Standard covert            317      64       21.6    11.6 
Heavily-prompted overt     350      79       18.3    12.1 
Heavily-prompted covert    245      61       17.9    13.6 

covert conditions on the short-answer test 
considering only Ss who took the Standard 
version of the program (t = .86). Inci- 
dentally, there was a larger difference 
between the two program versions on the 
short-answer section of the test (F = 
10.69, df = 1/104, p < .01) than on the 
multiple-choice section (F = 2.60, df = 
1/104, p > .05). 

None of the posttest effects was quite as 
strong as had been expected. To our dis- 
tress we have discovered a possible reason 
for weak effects. On many frames it was 
possible to read the correct answer (which 
appeared on the top of the next page) 
through the paper. This was a prompt that 
we had not counted on, and it un- 
doubtedly reduced the difference between 
the Standard and Heavily-Prompted ver- 
sions (see Brown, 1966). This unplanned 
prompt may also have reduced the differ- 
ence between the overt responding and 
covert responding conditions for those who 
received the Standard program. One of the 
items in the questionnaire asked S to de- 
scribe any shortcuts he used in completing 
a program. Perhaps it is significant that 
five of the eight Ss who reported that they 
occasionally read the correct answer 
through the paper were in the Standard 
Overt group. One S expressed the problem. 
He wrote, 

I was annoyed at times by the fact that the correct 
responses on the next page were often visible be- 
cause of the thiness (sic) of the paper. In cases 
where I had already looked at these responses it's 
difficult to tell if I really thought out & wrote 
down the correct response, or just copied the re- 
sponse that was visible through the page. 

As one might expect, the error rate was 
lower in the Heavily-Prompted version 
(4.1%) than in the Standard version 
(8.0%) of the program (t = 3.08, df = 
52, p < .01). 

Figure 1 pictures the time in seconds per 
frame to complete the program sets for 
each of the treatment groups. Since the 
record of training time included time spent 
reading exhibits, four undergraduate vol- 
unteers were hired to read the exhibits for 
meaning in order to provide a basis for 
adjusting work rates. Each S was timed 
individually on each exhibit with a stop 
watch. The results, expressed in seconds 
per frame, appear in the bar graph at the 
bottom of Figure 1. The figure reveals 
that work rates were very stable for every 
treatment group across the entire pro- 
gram. 
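
As a rough illustration of the seconds-per-frame measure, the mean training times in Table 2 can be converted directly. This is only a sketch: it divides total time by the 1,052 frames and does not remove exhibit reading time, so the values are unadjusted and need not match the points plotted in Figure 1.

```python
FRAMES_IN_PROGRAM = 1052  # frames in the 25 program sets

# Mean training time in minutes, taken from Table 2.
mean_training_minutes = {
    "Standard overt": 470,
    "Standard covert": 317,
    "Heavily-prompted overt": 350,
    "Heavily-prompted covert": 245,
}


def seconds_per_frame(total_minutes: float, n_frames: int = FRAMES_IN_PROGRAM) -> float:
    """Convert total training time in minutes to mean seconds per frame."""
    return total_minutes * 60 / n_frames


for group, minutes in mean_training_minutes.items():
    print(f"{group}: {seconds_per_frame(minutes):.1f} sec per frame")
```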

The thesis is that the Heavily-Prompted 
version resulted in poorer achievement 
than the Standard program because it 
permitted attenuated inspection behavior. 
It is argued that students often were able 
to respond correctly on the basis of 
prompts without paying attention to the 
entire cue, even, for example, to the defini- 
tion of a technical term. As a result the 
cue sometimes failed to become a dis- 
criminative stimulus for the response and 
so, for instance, the student was some- 
times unable to produce a technical term 
when its definition appeared on the post- 
test. The time data furnished indirect evi- 
dence that this analysis is correct, if it is 
assumed that the additional time spent by 
those who received the Standard program, 
must have been reading time and thinking 
time. More direct evidence was obtained 
from the questionnaire. Three open-ended, 
essay-type questions asked the student to 
comment on the procedure employed to 
complete the program, including any short- 
cuts used. Not counting those who said 
they sometimes read the answer through 
the page, there were 22 Ss who reported 
using prompts as shortcuts for filling 
blanks, distributed among treatment 
groups as follows: Standard Overt, 0; 
Standard Covert, 1; Heavily-Prompted 
Overt, 11; Heavily-Prompted Covert, 10. 
The difference between the Standard and 
Heavily-Prompted versions was significant 
(χ² = 22.8, df = 1, p < .001). The com- 
ments themselves were often informative. 
One S wrote, “When the first letters of 
the missing answer were given, it was diffi- 
cult to not answer [before] studying the 
question first, as the answers were obvi- 
ous.” Another commented, “I noticed that 
the underlined words were usually the 
answers so [I] often copied them before 
reading.” Still another explained, “By see- 
ing the first letter or letters of the word I 
immediately wrote down the word with- 
out understanding fully the written ma- 
terial.” 

Fig. 1. Training time per frame. (Each point represents pooled data from 
adjacent sets.) [Vertical axis: seconds per frame; horizontal axis: program 
sets; curves for Standard Overt, Heavily-Prompted Overt, Standard Covert, 
and Heavily-Prompted Covert; bar graph for exhibits.] 
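
The chi-square value reported for the shortcut counts can be checked with a short calculation. The sketch assumes 27 Ss in each of the four treatment groups (108 Ss divided evenly, which is consistent with the statement that numbers were equalized but is not given explicitly) and uses the Pearson statistic without a continuity correction.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


SS_PER_VERSION = 54  # assumption: 27 Ss in each of the two groups per version

standard_users = 0 + 1    # Standard Overt + Standard Covert
prompted_users = 11 + 10  # Heavily-Prompted Overt + Heavily-Prompted Covert

chi2 = chi_square_2x2(
    standard_users, SS_PER_VERSION - standard_users,
    prompted_users, SS_PER_VERSION - prompted_users,
)
print(f"chi-square = {chi2:.1f}")  # prints "chi-square = 22.8"
```

Under the equal-group assumption the uncorrected statistic agrees with the value of 22.8 given in the text.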

The present experiment complements 
the important work of Holland and Kemp 
(1965; Kemp & Holland, 1966) who have 
maintained that the effectiveness of a pro- 
gram depends upon the extent to which 
responses are contingent upon attention to 
the “critical material” (presumably mean- 
ing the set of stimuli that we have called 
the cue) in the frames. They have de- 
veloped a procedure called the “blackout 
technique” to measure this contingency. 
The “blackout ratio” is the proportion of 
material in a program that can be lined 
through with a black crayon without af- 
fecting error rate. Two undergraduates 
were hired to apply the blackout tech- 
nique to the programs used in this experi- 
ment. Beforehand they received about [...] 
minutes of training in the technique using 
materials prepared by Holland and Kemp 
for this purpose. Neither rater was familiar 
with the program or the subject matter; 
both were ignorant of the fact that there 
were actually two programs instead of one. 
The first four sets of the program were 
processed. In order to control individual 
differences in tendency to eliminate ma- 
terial, one rater evaluated the first and 
third sets from the Standard program and 
the second and fourth sets from the 
Heavily-Prompted version; the assignment 
was reversed for the other rater. The 
blacked out programs were not tested to 
determine if error rate had been affected. 
The blackout ratios were 35.5% and 43.6% 
for the Standard and Heavily-Prompted 
programs, respectively. These figures con- 
firm that the Heavily-Prompted program 
was less effective because the prompts 
undermined the contingency between the 
cue and the response. 
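
The blackout ratio described above amounts to a simple proportion. The sketch below assumes the unit of material is the word and uses invented word counts; the paper does not say how material was counted, and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def blackout_ratio(frames: Iterable[Tuple[int, int]]) -> float:
    """Proportion of material blacked out across frames.

    Each frame is a pair (total_words, blacked_out_words).
    """
    frames = list(frames)
    total = sum(total_words for total_words, _ in frames)
    blacked = sum(blacked_words for _, blacked_words in frames)
    if total == 0:
        raise ValueError("no material to evaluate")
    return blacked / total


# Hypothetical counts for three frames of one program set.
example_set = [(20, 5), (30, 15), (50, 20)]
print(f"blackout ratio = {blackout_ratio(example_set):.1%}")
```

The ratio is meaningful as defined only if the blacked-out version leaves error rate unchanged, a check the authors note was not carried out here.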

While prompting techniques undoubt- 
edly can be used to great advantage, this 
experiment has established one boundary 
condition for their use. Learning is reduced 
when the prompts are of such a nature that 
it is possible for the student to respond cor- 
rectly without attending to the cues. 


Both ethnic group and sex were significantly related to teachers’ 
descriptions of classroom behavior of 153 7th-grade students; girls 
and Other Ss were described more favorably than boys and Negroes 
on 75% of 64 behavioral rating scales. There were no interactions of 
ethnic group and sex. Comparisons of IQ equivalent subgroups 
showed, however, that the effect of ethnic group tended to be con- 
tingent on scholastic aptitude and was not dependent on sex; higher- 
IQ Negro Ss were described as favorably as were higher-IQ Other Ss 
but the lower-IQ Negro pupil was more likely than the lower-IQ 
Other child to be described as maladjusted, verbally aggressive, and 
low in task orientation. Boys, regardless of IQ or ethnic group, were 
described less favorably than girls. 


Little is known about the associates of 
teachers’ opinions of Negro and white 
pupils, although the importance of teach- 
ers’ opinions and the fact that they are re- 
lated to demographic variables such as sex 
and socioeconomic status appears to be 
well documented. 

It has frequently been reported that 
teachers are more likely to describe boys 
than girls as maladjusted or as behavior 
problems (Beilin, 1959; Goldstein & 
Chorost, 1966; Long & Henderson, 1966; 
Vroegh & Handrich, 1966) and that chil- 
dren from well-to-do families are more 
likely than are lower-class children to 
meet with approbation and success in 
school (Charters, 1963). This latter find- 
ing has been interpreted as an indirect 
effect of social class differentials in aca- 
demic preparation and opportunities (Sex- 
ton, 1961), but other studies have indi- 
cated that even among children of equal 
academic achievement who attend the 
same school, students whose parents are 
semi-skilled, unskilled, or unemployed are 
described less favorably than are students 
from upper- and middle-class families 
(Davidson & Lang, 1960). 

*The authors wish to thank Karen Pettigrew 
for statistical guidance and for computation of the 
tests for linearity of regression and Ann Drake 
for her assistance in the analysis and interpretation 
of results. 

There is considerable evidence that stu- 
dents who are described unfavorably by 
their teachers tend (a) to describe them- 
selves unfavorably, (b) to be aware of 
their teachers’ poor opinion of them, and 
(c) to receive lower grades than students 
whom the teacher describes favorably 
(Davidson & Lang, 1960; de Groat & 
Thompson, 1949; Fox, Lippitt, & Schmuck, 
1964; Goldblatt & Tyson, 1962). Despite 
agreement that teacher attitudes toward 
Negro children should be highly [...] 
for their classroom behavior (Clark, [...]; 
Coleman, 1966; Deutsch, [...]; [...], 
1964; Riessman, 1962), Katz (19[...]) has 
concluded that there has been no [...] 
assessment of the attitudes of white teach- 
ers toward minority-group pupils. 

The present analyses were undertaken to 
identify, for use in planning an inservice 
teacher education program to facilitate in- 
tegration, the extent to which the varia- 
bles of sex and scholastic ability are as- 
sociated with teachers’ descriptions of 
Negro and Other-than-Negro students. 


Method 


Subjects 


The subjects (Ss) were selected from seventh- 
grade classes in a northern Virginia suburban 
community which had integrated its schools in the 
year of the study. Ethnic group membership was 
inferred from attendance at segregated schools dur- 
ing the prior year. By this criterion, 9.8% (199) of 
all seventh-grade students were Negro and 90.2% 
(1931) were Other-than-Negro. (These students 
will hereafter be referred to as Others.) The 
small total population of Negro students pre- 
cluded selecting proportional random samples of 
Negro and Other Ss. In order to include as many 
Negro Ss as possible within the practical limits of 
teacher contact, the junior high schools with the 
largest numbers of Negro students were identified. 
Eighty-nine percent of the Negro students at- 
tended three of the system’s six junior high schools. 
It was not possible to estimate social-class mem- 
bership directly for each student; however, two of 
the schools were in neighborhoods judged as lower 
class (mixed residential and commercial buildings, 
low-cost housing, poor upkeep of buildings) and 
one, by the same criteria, was judged to be lower 
middle class. As these schools draw students from 
neighborhoods surrounding them, the social class 
of the student group may be at least generally in- 
ferred. In the two schools judged to draw lower- 
class students, 24.5% and 27.7% of the seventh- 
grade students were Negro. In the third school, 
which was judged to draw lower middle-class stu- 
dents, 9.8% of the seventh-grade students were 
Negro. A sample of 100 Negro and 100 Other Ss 
was chosen by a table of random numbers from 
the total of 177 Negro and 805 Other students 
attending these three schools. The students were 
not selected with consideration as to sex. 
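
The selection step can be illustrated with a brief sketch. It is a stand-in under stated assumptions: the study drew its sample from a printed table of random numbers, whereas the sketch uses Python's pseudo-random generator, and the student indices and seeds are hypothetical. Only the pool sizes (177 and 805) and the sample size (100 per group) come from the text.

```python
import random

# Pool sizes reported in the text for the three junior high schools.
N_NEGRO_POOL = 177
N_OTHER_POOL = 805
SAMPLE_SIZE = 100


def draw_sample(pool_size, sample_size, seed):
    """Draw `sample_size` distinct student indices without replacement.

    Indices 0..pool_size-1 are hypothetical stand-ins for student records.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return sorted(rng.sample(range(pool_size), sample_size))


negro_sample = draw_sample(N_NEGRO_POOL, SAMPLE_SIZE, seed=1)
other_sample = draw_sample(N_OTHER_POOL, SAMPLE_SIZE, seed=2)

# The sampling fractions differ sharply because the two pools differ in size.
negro_fraction = SAMPLE_SIZE / N_NEGRO_POOL
other_fraction = SAMPLE_SIZE / N_OTHER_POOL
print(f"Negro sampling fraction: {negro_fraction:.3f}")
print(f"Other sampling fraction: {other_fraction:.3f}")
```

Because equal numbers were drawn from pools of very different size, the two groups were sampled at different rates, which is why the samples are not proportional to the population.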


Procedure 


One teacher was selected at random from each 
S’s schedule card. Both to conserve teacher time 
and to avoid, insofar as possible, biasing the data 
with respect to individual differences among raters, 
no teacher was asked to evaluate more than five 
students. In the “lower middle-class” school, all 
17 seventh-grade teachers were included in the 
survey; in the other two schools, 76.7% (23 out 
of 30) of the seventh-grade teachers participated. 
Twenty of these 40 teachers taught general educa- 
tion courses, five taught mathematics, and the re- 
maining 15 taught subjects such as physical edu- 
cation, language, music, and art. Thirty-six of the 
teachers were white and 30 were women. Class 
assignment in the system, and therefore selection 
for inclusion in the sample, was random with re- 
spect to pupil and teacher ethnic group and sex. 

¹The community, while predominantly white, 
includes a variety of ethnic groups. No direct in- 
formation on family background can be obtained 
for the Ss. The students who had not attended a 
Negro school are therefore described as “Others” 
and the variable will be referred to as ethnic 
group rather than race. 

The teachers were contacted by mail. The cover- 
ing letter stated that the intention of the study 
was to standardize the instruments and requested 
the teacher’s cooperation in describing the ad- 
justment and classroom behavior of the five stu- 
dents selected for him or her. 

The questionnaires were distributed in Novem- 
ber 1965. Of the 200 eligible students, 178 were 
rated and 153 records were sufficiently complete 
to be included in the analyses. There were signifi- 
cant differences in the proportion of records re- 
turned by ethnic group (92% returns for Other Ss 
as compared with 61% returns for Negro Ss), but 
not by sex. Nonreturns were due primarily to in- 
sufficient teacher time and to the fact that 11 
pupils had left the school system between the time 
of selection and rating. 
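
The difference in return rates can be checked with a short calculation. The text does not say which test the authors used; the sketch below assumes a Pearson chi-square without continuity correction. The counts follow from the reported percentages (92 and 61 usable records out of 100 selected students per group).

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts.

    No continuity correction is applied (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: Other, Negro; columns: usable record returned, not returned.
returns = [[92, 8], [61, 39]]
stat = chi_square(returns)
df = (len(returns) - 1) * (len(returns[0]) - 1)
print(f"chi-square = {stat:.2f}, df = {df}")
```

With one degree of freedom the statistic is far beyond the conventional .01 critical value of 6.63, which agrees with the statement that returns differed significantly by ethnic group.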


Measures 


The teachers were asked to rate the students’ 
adjustment on the following scale which was de- 
veloped by Ullman (1952) and modified by Glide- 
well, Domke, and Kantor (1963). The first two 
categories of Glidewell et al.’s scale were revised 
to emphasize social rather than academic accom- 
plishment as the criterion of adjustment. 

1. Well adjusted. A happy child who is well 
adjusted in his relationships with others and in 
his activities. 

2. No significant problems. A child who gets 
along reasonably well and has little or no diffi- 
culty adjusting to others or to classroom ac- 
tivities. 

3. Subclinically disturbed. A child who is not 
so happy as he might be; has moderate diffi- 
culties getting on; and to whom growing up rep- 
resents something of a struggle. 

4. Clinically disturbed. A child who has, or at 
his present rate is likely to have, serious prob- 
lems of adjustment, and needs clinical help be- 
cause of such problems. 

It will be noted that with this instrument the 
teachers were not asked to identify students who 
presented problems in classroom management; 
attention was rather directed toward a more 
clinical definition of social and emotional adjust- 
ment. 

The teachers were then asked to rate each of 
their students on the Classroom Behavior Inven- 
tory (CBI), a recently developed 320-item ques- 
tionnaire (Schaefer, Aaronson, & Burgoon, 1966). 
The questionnaire items were intended to describe 
behavior and to reduce as much as possible in- 
ferences about motives and feelings. Sample items 
included: “Often disagrees with what others sug- 
gest,” “Brags how he is able to outwit others,” 
“Begins work at once, as soon as something is 
assigned,” “Seldom talks to other children before 
or after class,” “Sticks to old ways of doing things; 
[illegible] to make changes.” 

The teacher was asked to describe the behavior 
of each child for each item, with the following re- 
sponse options: 1. Not at all like the child, 2. Very 
little like the child, 3. Somewhat like the child, and 
4. Very much like the child. The specific instruc- 
tions were: 

Please give a response to every item and base 
your response upon your personal observation 
and experience with the pupil. In the case of 
items relating to behavior which you have not 
observed, respond as you would expect this 
child to behave as a general rule. 

There are 64 five-item scales. Scale reliabilities 
for the sample of 153 Ss as estimated by Kuder- 
Richardson formula 20 ranged from .73 to .96. The 
median internal consistency scale reliability was 
.86. A principal components analysis, Varimax 
rotation, yielded three factors. Scales describing 
perseverance, conscientiousness, concentration, 
achievement orientation, academic seriousness, and 
methodicalness had loadings of .76-.86 on Factor 
I, “Positive task orientation.” Irritability, argu- 
mentativeness, attention seeking, 
quarrelsomeness, and dominance had loadings of 
.86-.93 on Factor II, “Verbal aggression.” Active 
helpfulness, cheerfulness, and gregariousness had 
high negative loadings and social withdrawal, de- 
pression, and emotional passivity had high positive 
loadings on Factor III, “Introversion-extroversion.” 
Adjustment ratings correlated .43, .43, and -.48, 
respectively, with the three factors. The average 
scores of each S were computed for the six scales 
with the highest loadings on each of the three fac- 
tors. (Due to computer limitations, true factor 
scores could not be computed. The average scores 
would be expected, however, to correlate highly 
with the true factor scores.) 
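
The two computations named in this paragraph, a Kuder-Richardson formula 20 reliability and a unit-weighted average score, can be sketched as follows. This is an illustration under assumptions rather than the authors' procedure: KR-20 is written here in its usual form for dichotomous (0/1) items, and the text does not state how the four-point CBI responses were scored for that purpose. All data values are hypothetical.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    `item_matrix` is a list of respondents, each a list of k item scores.
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    # Population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion endorsing item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def average_score(scale_scores):
    """Unit-weighted average of the scale scores chosen to represent a factor."""
    return sum(scale_scores) / len(scale_scores)


# Hypothetical data: four pupils rated on one five-item scale (1 = item endorsed).
ratings = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
reliability = kr20(ratings)

# Hypothetical scores on the six highest-loading scales for one pupil.
factor_average = average_score([14, 12, 15, 13, 16, 14])
print(reliability, factor_average)
```

The toy ratings are perfectly consistent across items, so the coefficient reaches its upper bound; real scales, such as those reported here with reliabilities of .73 to .96, fall below it.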


Results 

The distribution of adjustment ratings 
by sex and ethnic group is shown in Table 
1. Only 4% of all Ss were described as 
clinical problems, 33% of the students 
were described as very well adjusted, 40% 
were described as presenting no problems, 
and 23% were described as subclinical 
problems, a distribution similar to that re- 
ported by Ullman (1952) for ninth-grade 
white students. Clinically and subclinically 
maladjusted categories were pooled for the 
following 2 × 3 comparisons. Girls were sig- 
nificantly more likely than boys to be rated 
as well adjusted (χ² = 15.24, p < .01) 
whether the students were Other (χ² = 
6.43, p < .05) or Negro (χ² = 12.64, p < 
.01). Other students were more likely than 
Negro students to be rated as well adjusted 
if the students were girls (χ² = 8.40, p < 
.05), but not if the students were boys (χ² = 
2.30, p > .50). 

TABLE 1 
Percentages of Other and Negro, Male and Female Seventh-Grade Students Rated Adjusted and Maladjusted by Their Teachers 

                            Teacher ratings of adjustment 
Race and sex      N         Well adjusted    No problems 
Other 
  Male            42        31%              33% 
  Female          47        53%              32% 
Negro 
  Male            31        16%              35% 
  Female          29        24%              66% 
Total             149       33%              40% 

Among Other students there was no 
significant relation between IQ and adjust- 
ment (r = -.05, p > .10), whether the 
students were girls (r = -.16, p > .10) or 
boys (r = .06, p > .10). There was no 
relation between scholastic aptitude and 
adjustment for Negro girls (r = -.01, p > 
.10) but the less able Negro boys were 
more likely than were the brighter Negro 
boys to be described as subclinically or 
clinically disturbed (r = [illegible], p < 
.01). Scholastic aptitude is thus shown to 
be a significant associate of teachers’ rat- 
ings of adjustment only with Negro boys. 

Analyses by Ethnic Group and Sex for the 
Whole Sample 


A two-way unweighted means analysis 
of variance (Winer, 1962) was performed 
for ethnic group and sex for each of the 
64 scales. On 48 of the 64 scales, [illegible] 
significant at the .05 level. On one 
scale, “Work fluctuation,” there was a sig- 
nificant Ethnic Group × Sex interaction. 

RATINGS OF ADJUSTMENT AND CLASSROOM BEHAVIOR 

The mean IQ for girls, 101.4, did not differ 
significantly from the mean IQ for boys, 
100.2; the mean IQ for Negro Ss, 87.5, was 
significantly lower than the mean IQ, 
113.0, for Other Ss. While scholastic apti- 
tude would not seem to account for the 
differential description of boys and girls, 
the characteristics attributed to Negro Ss 
as compared to Other Ss might be as- 
sociated with lower scholastic ability 
rather than with ethnic group per se. 

There are considerable methodological 
difficulties in isolating the variance due to 
ethnic group and sex from that due to 
scholastic aptitude because of the low 
overlap in the CMMT distributions and 
the asymmetry of IQ/adjustment and IQ/ 
CBI scale correlations. As an example, 
“submissive” correlated -.30 with CMMT 
IQ for Other girls and +.40 for Negro 
boys; “methodical” correlated .51 and .40 
with IQ for Negro girls and Negro boys, 
but for Other boys, r = .01. Analysis of 
covariance was not appropriate since, ex- 
cept for Negro Ss, there was no reliable 
evidence of a linear relation between the 
dependent variables and scholastic apti- 
tude as measured by the CMMT. Despite 
restrictions on generalizations to the upper 
end of the Other IQ distribution, the most 
defensible approach seemed to be a three- 
way analysis of variance for ethnic group, 
sex, and IQ. 


Analyses by Sex, Ethnic Group, and Scho- 
lastic Aptitude for IQ Equivalent Sub- 
samples 


The overlap between Other and Negro 
CMMT distributions ranged from IQ 68 
to 114. Other students with IQs above 115 
(N = 87) were dropped from these analy- 
ses in order to facilitate matching IQ 
groups; all Ss with IQs below 68 (N = 6) 
were also dropped. “Higher” Ss were de- 
fined by IQs between 99 and 114; “lower” 
Ss were defined by IQs between 68 and 98. 
Two-way unweighted means analyses of 
variance (ethnic group and scholastic abil- 
ity) were computed for each of the 64 
scales for the IQ-selected samples of boys 
and girls. Of the 128 F’s, 14 were signifi- 
cant for ethnic group at the <.10 level, 21 
were significant for scholastic aptitude, 
and 27 of the interactions were statistically 
reliable. For expository simplicity we will 
describe the results primarily in terms of 
the 2 × 2 × 2 unweighted means analysis 
of variance completed for average scores. 
To facilitate interpretation of the intro- 
version-extroversion scores, the scales which 
measured introversion and extroversion 
were considered separately. The means for 
the eight cells for each of the four average 
scores are shown in Table 2. 

TABLE 2 
Means and Ns of Teacher Ratings on Factor Scores: I, Task Orientation (TO); II, Verbal Aggression (VA); Extroversion (E) and Introversion (I) by Scholastic Aptitude,ᵃ Ethnic Group and Sex for Seventh-Grade Students 

[Table body (factor score means) not recoverable from the source.] 

ᵃ CMMT IQ (very high, 115-140; high, 99-114; low, 68-98). 
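
The unweighted means approach named above can be sketched for a two-factor layout. This is a minimal illustration, assuming the standard formulation in which every cell mean receives equal weight and the harmonic mean of the cell sizes stands in for a common n. The study's design is 2 × 2 × 2 and its cell values are not recoverable here, so the means and cell sizes below are hypothetical. The within-cell mean square needed to form F ratios is omitted.

```python
def unweighted_means_ss(cell_means, cell_ns):
    """Between-cell sums of squares for a two-factor unweighted means analysis.

    `cell_means[i][j]` and `cell_ns[i][j]` refer to level i of factor A
    and level j of factor B.
    """
    p = len(cell_means)      # number of levels of A
    q = len(cell_means[0])   # number of levels of B
    # The harmonic mean of the cell sizes replaces the common n of a balanced design.
    n_h = (p * q) / sum(1.0 / n for row in cell_ns for n in row)
    # Marginal means are simple (unweighted) averages of the cell means.
    a_means = [sum(row) / q for row in cell_means]
    b_means = [sum(cell_means[i][j] for i in range(p)) / p for j in range(q)]
    grand = sum(a_means) / p
    ss_a = n_h * q * sum((m - grand) ** 2 for m in a_means)
    ss_b = n_h * p * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n_h * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(p)
        for j in range(q)
    )
    return {"n_h": n_h, "A": ss_a, "B": ss_b, "AxB": ss_ab}


# Hypothetical cell means and sizes (rows: ethnic group; columns: aptitude level).
means = [[10.0, 12.0], [14.0, 16.0]]
ns = [[4, 6], [12, 6]]
result = unweighted_means_ss(means, ns)
print(result)
```

Giving each cell equal weight keeps a large cell from dominating the marginal means, which matters in a design like this one where subgroup sizes are unequal.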

Sex. As Table 3 indicates, on three of the 
four average scores, sex accounted for a 
significant portion of the variance: girls 
tended to be rated higher than boys in task 
orientation and were less likely than boys 
to be described as either verbally aggres- 
sive or as high in introversion; boys, re- 
gardless of ethnic group, were more likely 
than girls to be described as withdrawn, 
asocial, and emotionally passive. To see if 
the effect of sex was linear, analyses of 
variance were computed on factor scores 
for the Other students only by IQ (high- 
est, higher, and lower) and sex: again, re- 
gardless of IQ, boys were described as 
more withdrawn, asocial, and emotionally 
passive than were girls. For the Other 
students only, with the highest IQ Ss in- 
cluded, girls were significantly more often 
described as cheerful, gregarious, and ac- 
tively helpful. Sex did not interact with 
ethnic group or scholastic aptitude on any 
factor except extroversion (at the .10 
level). 

TABLE 3 
F Ratios for Unweighted Means Analyses of Variance by Scholastic Aptitude, Ethnic Group, and Sex Computed for Average Scoresᵃ Derived from Teachers’ Ratings of the Classroom Behavior of Seventh-Grade Students 

                             F ratios for factor scores 
Source                       TO            VA            E             I 
Scholastic aptitude (A)      [illegible]   [illegible]   [illegible]   [illegible] 
[row label illegible]        [illegible]   14.27         [illegible]   11.33 
A × [illegible]              4.82          [illegible]   [illegible]   [illegible] 
B × C                        0.88          [illegible]   1.00          1.80 
A × B × C                    [illegible]   0.07          3.20*         [illegible] 

ᵃ I, Task Orientation (TO); II, Verbal Aggression (VA); III, Extroversion (E) and Introversion (I). 
* p < .10. 

Scholastic aptitude. Scholastic aptitude 
did not account for a significant portion of 
the variance on any factor. 

Ethnic group. Ethnic group accounted 
for a significant portion of the variance on 
task orientation and extroversion. Negro 
students were more likely than were Other 
students to be described as low in task 
orientation and, at the .10 level, were less 
likely to be described as helpful, cheerful, 
and gregarious. For two factors, task orien- 
tation and verbal aggression, the interac- 
tion of ethnic group with scholastic apti- 
tude was significant. On task orientation, 
the difference between Negro and Other 
students was greater among lower-IQ than 
among higher-IQ Ss, regardless of sex. On 
verbal aggression, the effect of ethnic 
group depended on scholastic aptitude: 
among brighter students, Negro Ss were de- 
scribed as less verbally aggressive than 
were Other Ss, while among low-IQ stu- 
dents, Negro Ss were described as more 
verbally aggressive than were Other Ss. 
The interaction of sex, ethnic group, and 
scholastic aptitude was significant at the 
.10 level for extroversion; the difference be- 
tween boys and girls, and Other and Negro 
students depended mainly on the low av- 
erage rating of 11.6 received by low-IQ 
Negro boys and the high ratings of 15.4 
and 15.5 received by high- and very high- 
IQ Other girls. 

L. DATTA, E. SCHAEFER, AND M. DAVIS 

The results of these analyses and of the 
analyses of the 64 individual scales for the 
IQ equivalent subsamples are summarized 
in the following section. 
1. Among Negro students, low-IQ Ss 
were more frequently described as low in 
adjustment, low in task orientation, verb- 
ally aggressive, rebellious, asocial, and un- 
ruly. They were seen neither as well be- 
haved nor as studious. Higher-IQ Negro 
Ss were likely to be described as task 
oriented, methodical, persevering, sociable, 
trustful, submissive, and as low in rebel- 
liousness and verbal aggression. They were 
seen as well behaved, hardworking, studi- 
ous pupils. 
2. Among Other students, low-IQ Ss 
were more frequently described as task 
oriented, low in verbal aggression, com- 
pliant, cooperative, and considerate. The 
CMMT and reading achievement scores 
suggest that they are not achieving [illegible] 
the high level of academic and social [illegible] 
described: not unpredictably, low-IQ Other 
Ss were also seen as lower in [illegible] 
and as more tense and fatigued than 
the other three subgroups. The higher-IQ 
Other Ss were likely to be described as [illegible] 
oriented, verbally aggressive, inquisitive, 
enthusiastic, and as leaders. 

In the IQ equivalent subsamples, [illegible] 
ethnic group is associated with descriptions 
of classroom behavior, but the [illegible] 
the association tends to be one [illegible] 
the scholastic aptitude of S [illegible] 
primarily to the unfavorable descriptions 
of [illegible] Negro as contrasted to Other 
students, particularly Negro [illegible]. The 
classroom behaviors described by the 
teachers suggest that the low-IQ Other 
student and the higher-IQ Negro student 
tend to cope with the demands of school 
by working hard and carefully and of- 
fering few problems in classroom manage- 
ment. The low-IQ Negro student appears 
to have resigned from the educational proc- 
ess and is seen as poorly adjusted, unruly, 
and uninvolved, behaving in ways gen- 
erally considered to be associated with 
educational failure and classroom manage- 
ment difficulty. The data suggest that the 
Other, higher-IQ student may be freer to 
adopt a mode of response characterized by 
exploration, dominance, independence, and 
academic interest. 

3. Regardless of ethnic group or scho- 
lastic aptitude, boys were more likely to 
be described as hyperactive, asocial, verb- 
ally and physically aggressive, and tense 
and were less likely to be described as 
friendly, methodical, persevering, task 
oriented, and well adjusted than were girls. 
They were not likely, however, to be de- 
scribed as lower in such traits as en- 
thusiasm, inquisitiveness, leadership, 
verbal expressiveness, academic ability, 
nor as higher in conformance. 


Discussion 


One important finding of this study is 
that the unfavorable description of the 
Negro student is associated primarily with 
Ss of lower scholastic aptitude. At least 
two questions may be raised concerning 
the interpretation of this finding, in addi- 
tion to the limitations imposed by the re- 
stricted range of scholastic aptitude and by 
the small Ns: (a) is it an artifact of so- 
cial class and (b) are teachers’ descriptions 
observations of actual behavior or per- 
ceptions that would be considered biased or 
limited in comparison to what other ob- 
servers might report? 

1. We do not know whether the interac- 
tion among scholastic aptitude and ethnic 
group is associated with Negroes in par- 
ticular, with minority groups, or more 
generally with social class. The description 
of the low-IQ Other student suggests some 
effort by the child to conform to the de- 
mands of middle-class parents for good 
grades and good behavior in school while 
the description of the low-IQ Negro stu- 
dent does not and seems more consistent 
with behavior generally attributed to chil- 
dren from low-income families. 


Schmuck and Luszki (1966) have re- 
ported that in a small, midwestern com- 
munity, there were no differences in achieve- 
ment, self-ratings, and teachers’ classroom 
behavior ratings when socioeconomic sta- 
tus was carefully matched for Negro and 
white students. They conclude that rela- 
tions among race, self-esteem, and achieve- 
ment are confounded in other studies with 
social class. Only 63 pairs of students, 
ranging in age from 8 to 16 years, were in- 
volved in the study; a larger sample may 
be needed at each age and grade level to 
test the social-class interpretation rigor- 
ously and the nature of the community 
might itself be a relevant variable (cf. 
Davidson & Lang, 1960). In our sample, it 
is possible that despite the somewhat 
homogeneous neighborhoods, Other and 
higher-aptitude Negro students came from 
less-deprived homes than did lower-apti- 
tude Negro Ss. 

2. We have referred to teachers’ de- 
scriptions rather than to either teachers’ 
perceptions or students’ behavior. The 
teachers had access to intelligence test and 
reading scores and knew the students’ 
ethnic group and sex. Whether in this in- 
stance the frequently postulated interac- 
tion between expectations and observations 
is weighted more heavily with expectations 
or was formed by observation relatively 
independent of teachers’ a priori values, is 
moot. 

Considering the “observation” interpre- 
tation, results similar to ours have been 
noted for younger children whose teachers 
had volunteered for the assignment. Such 
teachers might be expected to be somewhat 
more favorably disposed toward the chil- 
dren than public school teachers assigned 
to schools in low-income Negro neighbor- 
hoods. Lamb, Ziller, and Maloney (1965) 
found, for example, that white girls gained 
most from Headstart experiences and that 
the Negro boy was both least favorably 
described by his teacher and least likely 
to benefit from the preschool program. The 
description of the brighter Negro students 
as more “compliant” is congruent with the 
report that in comparison to white liberal- 
arts college students, Negro liberal-arts 
students scored higher on deference and 
lower on exhibitionism, autonomy, and 
dominance on the self-descriptive Edwards 
Personal Preference Schedule (Pettigrew, 
1964). 

Considering the “opinion” interpreta- 
tion, Rosenthal (1966) and Flowers (1966) 
have shown that students’ IQs and class- 
room performance tend to increase when 
teachers are led to believe the child’s 
intellectual potential is high relative to an 
equally bright control S whom the teacher 
believes has lower intellectual potential. 
Their effects have been demonstrated, 
however, more reliably in younger than 
older children. Rotter’s (1967) studies 
also suggest that “.... preconceptions in- 
fluence one’s perceptions and evaluations 
and that these might lead to differential 
treatment.” Groups of white teachers read 
vignettes reporting a child’s behavior. 
Analyses are reported for vignettes which 
differed in the sex ascribed to the child and 
classroom behavior: e.g., orderly/dis- 
orderly, “Ann carried a classmate’s 
books”; “Billy carried a classmate’s books.” 
Sex and the interaction of sex with class- 
room orderliness were significant associates 
of teacher’s rating on many of 80 bipolar 
scales, including a rating of boys as “dirt- 
ier.” Race (white and Negro) and social 
class (middle and low) were also variables 
in the study and the technique should be of 
considerable value in separating “opinion” 
and “observations” in the descriptions of 
minority children by majority teachers. 

Davidson and Lang’s comments (1960, 
p. 114) on the antecedents of scholastic 
difficulty may be relevant here: 


It is likely, therefore, that a lower class child, es- 
pecially if he is not doing well in school, will have 
a negative perception of his teachers’ feelings to- 
ward him. These negative perceptions will in turn 
tend to lower his efforts to achieve in school and/or 
increase the probability that he will misbehave. 
His poor school achievement will aggravate the 
negative attitudes of his teachers toward him 
which will in turn affect his self-confidence, and so 
on. This vicious entanglement must be interrupted 
at some point. The point of attack may well be the 
teacher whose capacity to reflect feelings con- 
ducive to the child’s growth should be of concern 
to educators. 


To this we would add that both the rel- 
atively high correlation between IQ and 
adjustment for Negro males and the analy- 
sis of variance results suggest that the low- 
IQ Negro is alienated from the school situ- 
ation, that is, is not task oriented and is 
verbally aggressive and withdrawn. This 
finding supports the need for programs de- 
signed to raise the level of intellectual 
performance before the vicious cycle of low 
achievement, teacher rejection, and child 
alienation begins. 

A second major finding in this study is 
that boys were described as less task 
oriented, more verbally aggressive, and 
more introverted than girls. The latter re- 
sult is unexpected as girls have been gen- 
erally described as less outgoing and more 
introverted than boys. Our initial interpre- 
tation of an introversion-extroversion anal- 
ysis was that helpful classroom behavior, 
“extroversion,” might be accounting for 
most of the variance: clearer sex differ- 
ences were found, however, for the “in- 
troversion” than for the “extroversion” 
factor scores. 

Interpretations of the sex differences in 
personality and attainment have ranged 
from biological forces to a greater dis- 
parity for boys than for girls between the 
classroom demands of female teachers 
and socially defined behavior appropriate 
to the student’s sex. Maccoby (1966), re- 
viewing this literature, has noted that peer 
group pressures on boys are often directed 
to nonacademic pursuits; 


that boys are more frequently engaged in [illegible] 
to achieve autonomy, especially in relation to 
mothers, with the result that they are [illegible] 
to accede to the demands of their predominantly 
female teachers; and that even in [illegible] 
boys are more likely to do poorly in subjects that 
bore them [p. 32]. 

The observed higher “introversion” rat- 
ings for boys may thus indicate [illegible] 
a traditionally academically [illegible] 
classroom situation rather than a 
general trait. 


LEARNING ABILITIES OF NORMAL AND RETARDED 
CHILDREN AS A FUNCTION OF SOCIAL CLASS¹ 


JACQUELINE L. RAPIER² 
University of California, Berkeley 


This study compared the learning ability of normal and retarded ele- 
mentary school children (N = 80) from high- and low-SES back- 
grounds on a series of learning tasks. On the first day, all Ss learned a 
serial list and a paired-associate list. 24 hr. later, Ss were divided into 
an experimental and a control group on the basis of CA, IQ, and 
SES. The experimental group learned a 2nd list of paired associates 
under conditions of mediation, that is, sentences were provided linking 
the pairs on the 1st trial. The control group learned paired associates 
without instruction in mediation. 1 wk. later, all Ss learned a 3rd list 
of paired associates. Results showed IQ differences in performance in 
both SES groups on serial and paired-associate learning. A significant 
mediation effect was found on the 2nd day, but this effect did not 
transfer to the learning of paired associates 1 wk. later in any group. 
However, over the 3 lists of paired associates, an increasing superiority 
in performance was found for the low-SES retardates as compared to 
the high-SES retardates. 


Sarason and Gladwin (1958) note that a 
large number of retardates are from the 
most economically and socially under- 
privileged families in society and suggest 
that the impoverished environment has 
provided minimal opportunity for the 
learning of skills which are tapped by cur- 
rent intelligence tests. Since these retar- 
dates usually are not identified until they 
enter school and show difficulty in dealing 
with verbal tasks, Sarason hypothesizes 
that language skills receive little stimula- 
tion in their home environment. Diagnosis 
is complicated by the fact that retardates 
are usually most deficient in the verbal 
area, so it has become a problem to differ- 
entiate the organically retarded child (or- 
ganic retardation) from the child who 
appears retarded due to an early history 
of environmental deprivation (cultural re- 
tardation). 

Jensen (1967) suggests the use of a 
variety of direct learning tests to assess the 
child’s disability rather than employing 
standardized measures of past achieve- 
ment. A series of experiments by Jensen 
and Rohwer (1963a, 1963b, 1965) showed 
that serial and paired-associate learning 
represent quite different levels of learning. 
Although both tasks involve rote learning 
skills, paired-associate learning appears to 
be more complex due to the more impor- 
tant role of verbal mediation. Experiments 
which directly instructed subjects to 
use verbal mediators showed greatly facili- 
tated paired-associate learning, but verbal 
mediators did not influence the rate of serial 
learning. Among normal children, ability in 
paired-associate learning increases with 
age up to 18 years, presumably due to the 
increasing use of verbal facilitative devices. 
There appears to be no similar increase in 
the ability to learn a serial list [illegible] 
8, which Jensen suggests is related to its 
lack of dependence on verbal mediators. 
On the basis of this research, Jensen pro- 
poses that serial learning more [illegible] 
measures learning ability relatively unaf- 
fected by S’s previous verbal [illegible], 
while paired-associate learning is depend- 
ent on the richness of S’s verbal [illegible] 
and on the availability of relevant 
mediators. 

¹ This report is based on a doctoral dissertation 
submitted to the University of California, Berke- 
ley. 

² Now with the Palo Alto Unified School Dis- 
trict, Palo Alto, California. The author is in- 
debted to [text missing] the Chairman, Arthur R. Jensen and to a number 
of school districts in the San Francisco Bay Area 
who cooperated in the study. 
Various investigators (Griffith, Spitz, & 
Lipman, 1959; Jensen, 1965; [illegible]) 
agree that retardates do not employ ver- 
bal mediators as effectively as normals. 
From an educational point of view, it is 
important to know if the retarded child is 
unable to elicit verbal associations as 
mediators because he has not had the op- 
portunity to form necessary associations 
prior to the learning situation or because he 
has a neurological deficit in association 
formation. Obviously, the two types of 
slow learners should have quite different 
types of educational treatment as the 
former may be able to overcome his 
handicap through remedial training while 
the latter requires more appropriate educa- 
tional goals. 

Jensen (1967) suggests that the cultur- 
ally retarded child will learn a serial list 
more efficiently than a paired-associate 
list as serial learning does not depend as 
much on the richness of the child’s early 
language experience. The organically re- 
tarded child will be slow in both serial and 
paired-associate learning even if his en- 
vironment has been good, because of a 
basic deficiency in the neural equipment 
for learning. 

The present experiment was designed to 
explore the relation of serial and paired- 
associate learning to IQ and socioeconomic 
status (SES). The specific hypotheses are 
as follows: 

1. Since it is hypothesized that serial 
learning is relatively unaffected by verbal 
mediation but does reflect learning ability, 
IQ will predict serial learning ability more 
effectively in the high- than in the low- 
SES groups. 

2. Paired-associate learning, which is be- 
lieved to involve verbal mediational proc- 
esses to a greater extent than serial 
learning, will closely reflect IQ differences 
in both high- and low-SES groups. 

3. The normal and retarded groups from 
high- and low-SES will show increased 
speed of paired-associate learning after 
verbal mediation instruction as compared 
to control groups not receiving instruc- 
tion. 

4. Transfer of the mediation technique 
to a new paired-associate task 1 week later 
should be greater for the normal than for 
the retarded groups in both SES groups. 
However, the retarded low-SES group will 
show greater transfer of mediation tend- 
encies than the retarded high-SES group. 


Method 


Subjects 


Eighty white American Ss were selected from 
public elementary schools in Alameda and Contra 
Costa County, California. The age range was 91 
to 154 months. The Ss were equally divided be- 
tween two socioeconomic classes, an upper and a 
lower class. The Ss were classified as high SES if 
their father was in a professional, semiprofessional 
or managerial occupation and had completed 2-4 
years of college. The Ss were placed in the low-SES 
group if their father was engaged in unskilled or 
semiskilled work and had not gone beyond high 
school. There were two IQ levels in each SES 
group: a normal IQ group and a retarded IQ 
group. The normal group was selected on two 
bases: an IQ of 100-110 on a group test, the Kuhl- 
man-Anderson Intelligence Test, and average 
achievement for grade placement on the Stanford 
Achievement Test. The retarded group was se- 
lected on two bases: an IQ score of 63-78 on a 
recent (not more than 2 years) Stanford-Binet 
Intelligence Scale and placement in a special class 
for educable mentally retarded children. It was 
not possible to obtain individual test scores on the 
normal group. The Kuhlman-Anderson is reported 
to measure substantially the same thing as the 
Stanford-Binet (Cronbach, 1960). Furthermore, it 
is rare that a child who receives an average IQ 
score on a group test and achieves at grade level 
would score below average on an individual test. 

Since the study rests on the hypothesized rela- 
tionship of verbal ability to SES, additional infor- 
mation was sought on S’s verbal development. 
Vocabulary is a widely used measure of language 
development, and it is susceptible to environmental
stimulation (McCarthy, 1954). The Peabody Picture
Vocabulary Test (Dunn, 1965) is an individual picture
vocabulary test where S is required to choose the
picture on the plate which best illustrates the meaning
of the word provided by the experimenter (E). The
Peabody seemed preferable to the standard vocabulary
test where S must define the meaning of the word since
the latter depends on S's ability to express his ideas
verbally. Retarded Ss may comprehend the meaning of
the word, but lack the fluency of speech to provide a
satisfactory definition.

The Peabody was administered to all Ss. The
high-SES group had a mean raw score of 78.55 and
a mean MA of 127.65 months. The low-SES group
had a mean raw score of 72.37 and a mean MA of
112.37 months. The MA difference between SES
groups is significant (F = 7.18, df = 1/72, p < .01),
indicating that SES is related to verbal ability in
this population.

The Ss were assigned to one of the four groups
on the basis of their age, IQ, and SES. There were
20 Ss in each group. The ratio of boys to girls was
kept approximately equal in each of the four
groups, 15 boys to 5 girls. Children whose health
records indicated sensorimotor disabilities or emotional
disorders were excluded from the study.
Table 1 presents information on the characteristics
of each group with the means, standard deviations,
and range for CA, MA, and IQ for the 80 Ss.
When attrition occurred within one of the four
groups, an additional S was drawn at random from
a pool of Ss and assigned to that particular group.
Four Ss from the high-SES retarded group and two
from the low-SES retarded were eliminated due to
failure to reach the criterion on original learning.
In addition, five Ss were replaced when they were
absent on the second day or final week of the experiment.

TABLE 1
CHARACTERISTICS OF THE FOUR GROUPS

Group             |       | CA (months) | MA (months) | IQ
High-SES normal   | Mean  | 124.40      | 130.75      | 105.10
                  | Range | 94-147      | 99-152      | 100-110
                  | SD    | 18.48       | 19.87       | 3.70
Low-SES normal    | Mean  | 126.10      | 131.55      | 104.50
                  | Range | 96-149      | 103-155     | 100-110
                  | SD    | 16.63       | 16.29       | 3.23
High-SES retarded | Mean  | 123.95      | 88.60       | 71.45
                  | Range | 92-154      | 60-115      | 63-78
                  | SD    | 19.95       | 15.48       | 4.95
Low-SES retarded  | Mean  | 123.90      | 86.60       | 70.20
                  | Range | 91-1        | 68-103      | 63-78
                  | SD    | 17.21       | 10.25       | 3.64


Design


Day 1. All Ss were given two tasks, serial and
paired-associate learning. In serial learning, a different
order of pictures was used for each S in a
group but the same 20 different orders were repeated
in each group. In paired-associate learning,
the position of the pairs was randomly changed
from trial to trial by shuffling the cards between
trials. In order to avoid practice and fatigue effects,
half the Ss learned the serial task first and
then the paired associates; the other half learned
the paired-associate task followed by the serial
task.

Day 2. The second testing session occurred 24
hours later. The Ss were assigned to an experimental
group and a control group on the basis of
their CA, IQ, and SES. The experimental and control
groups learned a new list of paired associates
under different conditions of instruction. The experimental
group received instruction in the use
of verbal mediators (mediation) while the control
group did not receive such instruction (nonmediation).
The paired-associate list presented on Day
1 was counterbalanced with the paired-associate
list presented on Day 2 since one could not safely
assume the two lists were equivalent in difficulty.

Day 3 (1 week later). All Ss learned a third list
of paired-associates under the same nonmediation
conditions.


Stimulus Materials 


The stimulus material of both the serial and
paired-associate tasks consisted of black and white
pictures of common objects, for example, ball,
house, table. Pictures were cut from preprimer
workbooks and mounted on gray cardboard. There
were four sets of pictures, each set using different
pictures. One set used for serial learning consisted
of nine pictures, each mounted on 4 × 4-inch cardboard.
Three sets used for paired-associate learning
consisted of nine pairs of pictures on 5 × 7-inch
cards. On one side a single picture appeared,
and on the other side of the card, the same picture
was paired with another picture unrelated to the
stimulus picture. Paired associates were made from
pictures paired at random, but obvious relations
of sound and meaning were avoided between
members of each pair.


Method of Stimulus Presentation 


The Ss were tested individually in an unused
room in the school building. Testing conditions
were reasonably comparable in terms of extraneous
stimuli. The S was seen at about the same time for
each session.

During the experiment, E sat at a table facing
S. On Day 1, E gave the following instructions for
serial learning:


We're going to play a short game. See the
cards in a row? When I turn the card over, there
is a picture on the other side. Name each picture
as I show it to you. [The S named each picture
as E turned over the card.] Now, I want you to
learn the names of all the pictures in this row.
When I point to the card, you tell me what you
think is on the other side. Then I will turn the
card over and you can see if you are right.


Immediately following serial learning, E presented
the paired-associate task.


[...] this [E
showed one stimulus picture on a sample card],
then I'll turn the card over like this, and you'll
see the same picture with another one
like this. I want you to say the name of the
picture next to the first one. Let's see how
you can learn which pictures go together. [S was
presented the series of nine cards and
asked to name the stimulus and response pictures
in each case.]


On subsequent trials, S was only required to name
the response picture when he saw the stimulus
picture. Whether S made a correct response or not,
the card was turned over so that S could see the
stimulus and response pictures side by side.
On Day 2, two conditions of instruction were
used: mediation and nonmediation. In the mediation
condition, a standard set of sentences linking
the stimulus with the response was provided by E
upon initial presentation of each pair. Providing
a standard set of sentences seemed preferable to S
making up his own sentences due to wide individual
differences in the ability among Ss to invent
their own phrases. The S was asked to repeat the
sentence after E. After the first trial, the procedure
was essentially the same as in the previous
day's learning of paired associates, that is, S had to
anticipate the response term when the stimulus was
shown. The S was discouraged from repeating the
sentences after the first trial.

The standard set of sentences given by E for
the first set of paired-associates were: (a) The
SPOON falls out of the NEST. (b) I stuck the Fag
inside my SHOE. (c) The coms dropped under the
CHAIR. (d) I carried the BASKET inside the HOUSE.
(e) My HAND winds the CLOCK. (f) The SCISSORS
cut the tear. (g) The mon takes a ride on the
BICYCLE. (h) The TREE grows inside the CUP. (i) The
GLASSES are eaten by the FISH.

The phrases for the second set were: (a) The
CAR runs over the BALL. (b) The DESK hides some
MONEY. (c) The FIRE burns the SAW. (d) The
PIANO is behind the COAT. (e) The KEY locks the
WAGON. (f) The BOAT scares the CAT away. (g) The
BOX holds a TELEPHONE. (h) The CANDLES ride on
the HORSE. (i) The BUS breaks the RING.

In the nonmediation condition of paired-associate
learning, E asked S to name the stimulus and
response terms on the first trial. After this, the
procedure was the same as in the mediation condition.

On Day 3, the third list of paired associates was 
presented to all Ss using the same procedure as 
in the first presentation on Day 1. At the conclu- 
sion of this session, all Ss were asked if they had 
used any special method to help them learn the 
list of paired associates. 

Criterion of learning on all four tasks was eight
out of nine correct responses on any one trial. All
tasks were S paced. The S was dropped from the
[...] if he failed to reach the criterion within
[...] trials.
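
The criterion rule above can be sketched as a small scoring routine. This is only an illustration: the function name, the example record, and the optional `max_trials` cutoff are hypothetical, and the actual trial limit used in the experiment is not legible in the source.

```python
from typing import Optional, Sequence

CRITERION = 8    # correct responses required on one trial (stated in the text)
LIST_LENGTH = 9  # items per list (stated in the text)


def trials_to_criterion(correct_per_trial: Sequence[int],
                        criterion: int = CRITERION,
                        max_trials: Optional[int] = None) -> Optional[int]:
    """Return the 1-based number of the first trial that meets the criterion.

    Returns None when the criterion is never met within the trials examined,
    which corresponds to an S who would be dropped.
    """
    for trial, n_correct in enumerate(correct_per_trial, start=1):
        if max_trials is not None and trial > max_trials:
            break  # hypothetical cutoff; the source's trial limit is not legible
        if not 0 <= n_correct <= LIST_LENGTH:
            raise ValueError(f"trial {trial}: {n_correct} is outside 0..{LIST_LENGTH}")
        if n_correct >= criterion:
            return trial
    return None


# Hypothetical record of correct responses on successive trials for one S.
example_record = [2, 4, 6, 8, 9]
print(trials_to_criterion(example_record))  # first trial with at least 8 correct
```

Scoring each S this way yields the trials-to-criterion values that are summarized as means and standard deviations in the Results.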


RESULTS


The mean trials to criterion and standard
deviations of the distribution for the
four groups on Day 1 are presented in Table 2.

Individual comparisons among trial
means were made by the Scheffé method
(Hays, 1963). The statistical results support
the following conclusions: (a) SES
had no significant effect (F = 1.08, df =
1/79, p > .10) upon the learning of the two
tasks; (b) IQ had a significant effect
(F = 16.00, df = 1/79, p < .001) upon the
learning of both tasks; (c) normal Ss
learned the serial list in fewer trials than
the retarded Ss (F = 5.41, df = 1/72, p <
.05); (d) normal Ss learned the paired-associate
list in fewer trials than the
retarded Ss (F = 5.41, df = 1/72, p <
.05); (e) in both IQ groups, normal and
retarded, the paired-associate task was
much more difficult to learn than the corresponding
serial task (F = 144.42, df =
1/72, p < .001); (f) the order of presentation
of the two tasks made no difference
in the learning of the tasks (F = 3.67,
df = 1/79, p > .05).

Table 2 also shows the means and stand- 
ard deviations for the mediation and non- 
mediation groups on Day 2. It is obvious 
from the table that the effect of mediation 
was to reduce drastically the number of 
trials to criterion. Learning in the media- 
tion groups was four times as fast as in the 
nonmediation group. Use of the Cochran 
test for homogeneity of variance revealed 
that the assumption of homogeneity was 
not tenable. Siegel (1956) recommends a
nonparametric test, the Mann-Whitney U
test, as the most useful alternative to the
usual t test. Aside from the significant effect
of mediation instruction, none of the
other main effects was significant. In order
to keep the overall error rate for two-way
interactions in the range of .08, it was
decided to conduct each separate test at
.02. There were two significant interactions,
IQ × Instructions and SES × IQ.
Although both ability groups profited from
the mediating instructions, they did not do
so to the same degree. Normal groups surpassed
retardates in speed of learning under
mediated conditions. Normals maintained
their superiority in nonmediated
conditions. Inspection of the SES × IQ interaction
shown in Figure 1 revealed that
IQ differences in performance were solely
a function of SES. Normals and retardates
in high SES showed significant differences
in performance, but this did not occur in
the low-SES group where the normals and
retardates learned at about the same rate.
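
The relation between the per-test level of .02 and the overall rate of about .08 mentioned above can be checked with a short calculation. The number of two-way interaction tests is not stated in the text; four is assumed here only because 4 × .02 = .08, and the function name is illustrative.

```python
from typing import Tuple


def familywise_error(per_test_alpha: float, n_tests: int) -> Tuple[float, float]:
    """Return (additive upper bound, rate if the tests were independent)."""
    bound = min(1.0, per_test_alpha * n_tests)            # Bonferroni-type bound
    independent = 1.0 - (1.0 - per_test_alpha) ** n_tests  # 1 - (1 - alpha)^k
    return bound, independent


# Assumed: four separate tests, each conducted at the .02 level.
bound, independent = familywise_error(0.02, 4)
print(f"upper bound = {bound:.3f}, independent-test rate = {independent:.4f}")
```

Under that assumption the additive bound is .08 and the rate for independent tests is slightly lower, which agrees with the phrase "in the range of .08."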

TABLE 2
MEANS AND STANDARD DEVIATIONS FOR TRIALS ON DAY 1, DAY 2, AND DAY 3

Group | Day 1: Serial (M, SD) | Day 1: Paired associate (M, SD) | Day 2: Mediation (M, SD) | Day 2: Nonmediation (M, SD) | Day 3: Mediation (M, SD) | Day 3: Nonmediation (M, SD)
Engtgus rewarded. | 486 | 1098 | tocas | 8:77 | 2:30 | 1:88 | “7:80 | 3:40 | “200 | Scan | Oc20 | S90 | “a O8 |B

Fig. 2. Trials to criterion of four groups for
mediation and nonmediation on Day 2.

There was one significant three-way interaction,
SES × IQ × Treatment, in the
original analysis of covariance. It was
difficult to analyze the significance of this
interaction by means of the Mann-Whitney
U test due to the large number of separate
tests involved. When such a large number
of separate tests is conducted, there is
always the possibility of some being significant
by chance alone. Of the 28 tests, 20
had p < .01, which is more than one would
expect from chance. The SES × IQ ×
Treatment interaction can best be shown graphically
by Figure 2. Examination of Figure 2 immediately
points up the striking difference
in performance between the two retarded
groups. The high-SES retarded is the least
efficient learner in both conditions, whereas
the low-SES retarded does not differ appreciably
from the two normal groups.
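
The statement that 20 of 28 tests at p < .01 exceeds chance expectation can be illustrated with a binomial calculation. The sketch below assumes the 28 tests are independent, which pairwise comparisons among the same groups are not, so the result is only a rough guide. The count of 28 is consistent with all pairwise comparisons among eight cells (2 SES × 2 IQ × 2 treatments), although the text does not say so explicitly.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_TESTS = 28        # number of separate tests (from the text)
N_SIGNIFICANT = 20  # tests with p < .01 (from the text)
ALPHA = 0.01

pairwise_among_eight_cells = comb(8, 2)  # assumed origin of the 28 tests
expected_by_chance = N_TESTS * ALPHA     # expected number significant under chance
tail = binomial_tail(N_TESTS, N_SIGNIFICANT, ALPHA)

print(pairwise_among_eight_cells, expected_by_chance, tail)
```

Under the independence assumption fewer than one significant test would be expected by chance, so 20 significant results is far outside what chance alone would produce.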

Fig. 1. Trials to criterion of normals and retardates
on Day 2 as a function of social class.

Table 2 also shows the means and standard
deviations for all groups on Day 3.
The main treatment effect is not significant.
There is no significant difference
(F = 0.68, df = 1/71, p > .10) in the
learning of paired associates on Day 3 between
groups who had received mediation
instruction on Day 2 as compared to
groups who had not received instruction.
Main effects were significant for IQ (F =
7.92, df = 1/71, p < .01) and SES (F =
7.90, df = 1/71, p < .01). The normal
group required significantly fewer trials to
criterion than the retarded. The low-SES
group surpassed the high-SES group in
speed of learning. There was a significant
SES × IQ interaction (F = 8.19, df =
1/71, p < .01) shown in Figure 3. Pairwise
comparison by the Scheffé technique
revealed that the locus of the SES effect
was between the two retarded groups.
Low-SES retarded Ss required fewer trials
to criterion than high-SES retarded Ss.
Again, the low-SES retarded group did not
differ from their normal counterpart in performance.
This was in contrast to the normal and retarded
Ss from high SES who continued
to show a significant difference in performance.
The high-SES retardates took
about twice the number of trials to reach
criterion as the other groups.

Fig. 3. Trials to criterion of high and low SES
groups on Day 3 as a function of IQ.


Intercorrelations among Variables 


Product moment correlations were determined
for MA, IQ, and learning tasks.
Due to the use of extreme groups, correlations
are probably spuriously high but
they reflect the pattern of correlations 
that would be found in the total popula- 
tion. Most interesting are the intercorrela- 
tions among measures in the high- and low- 
SES groups reported in Table 3. IQ cor- 
related significantly with all learning 
tasks in the high SES but not in the low 
SES. With the exception of serial learning, 
the differences between the correlation co- 
efficients found in the two SES groups are 
significant (p < .05). Therefore, one can
conclude that there is a positive relationship
between IQ and paired-associate
learning for high-SES Ss but not for low-SES Ss.
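
One common way to test whether two independent correlation coefficients differ is Fisher's r-to-z transformation. The text does not state which procedure was used, so the sketch below is an assumed method, and the two r values are hypothetical placeholders rather than entries from Table 3. The group size of 40 follows from 20 normal and 20 retarded Ss in each SES group.

```python
from math import atanh, erfc, sqrt
from typing import Tuple


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> Tuple[float, float]:
    """Return (z statistic, two-tailed p) for H0: the two population correlations are equal."""
    z1, z2 = atanh(r1), atanh(r2)                   # Fisher r-to-z transform
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))      # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_tailed = erfc(abs(z) / sqrt(2.0))         # normal-curve two-tailed probability
    return z, p_two_tailed


N_PER_SES_GROUP = 40   # 20 normal + 20 retarded Ss in each SES group (from the text)
R_HIGH_SES = 0.60      # hypothetical IQ-learning correlation, high SES
R_LOW_SES = 0.10       # hypothetical IQ-learning correlation, low SES

z, p = fisher_z_difference(R_HIGH_SES, N_PER_SES_GROUP, R_LOW_SES, N_PER_SES_GROUP)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

With groups of this size, a difference of roughly this magnitude between the two coefficients would be needed to reach the .05 level.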


TABLE 3
INTERCORRELATIONS AMONG VARIABLES FOR HIGH- AND LOW-SES GROUPS
(Above diagonal High SES; below diagonal Low SES)

a MA based on Stanford Binet Intelligence Scale and Kuhlman-Anderson Intelligence Test.
b MA based on Peabody Picture Vocabulary Test.
c Since variables 4, 5, 6, 7 are trials-to-criterion scores, these variables have been reflected to yield
positive correlations between the psychometric tests and the learning measures.
* p < .05.
** p < .01.


DISCUSSION


Two important findings resulted from
the experiment. One, there is a significant
difference in the learning ability of retardates
from high SES as compared to
retardates from low SES. Two, IQ is a better
predictor of learning ability in the
high-SES groups than in the low-SES
groups.

The retarded groups did not differ in 
performance on the first day of the ex- 
periment. By the end of the week, the two 
groups were strikingly different in rate of 
learning. The low-SES retardate was learn- 
ing paired-associates at a much faster 
rate than he did earlier in the experiment; 
the high-SES retardate was continuing to 
learn at the same slow rate. However, the 
surprising aspect of this result is that the 
improvement in learning ability among 
the low-SES retardates cannot be attrib- 
uted to instructions in mediation as the 
control group showed as much improve- 
ment as the experimental group on Day 3. 

What caused the sudden improvement 
among the low-SES retardates? These re- 
tardates were not able to indicate any par- 
ticular method they used to learn the 
lists when questioned at the conclusion of 
the experiment. Common replies were, 
“I said them over and over,” “I memorized 
them,” “I don’t know.” The first hypothe- 
sis that comes to mind is that the improve- 
ment is due to the effects of practice and 
learning to learn. 

All four groups showed some improve- 
ment due to practice, but why should the 
effect be greatest among the low-SES re- 
tarded? Past research (Covington, 1962; 
Haggard, 1954) has shown that lower-class 
Ss benefit most from the opportunity to 
practice or become familiar with a task. 
This is generally explained by the lack of 
opportunity among lower-class Ss to be- 
come familiar with different ways of re- 
sponding to a variety of stimuli so their 
performance will be markedly inferior to 
upper-class Ss in many learning situations.
Once the lower-class Ss have the opportunity
to practice, they will benefit more than
the upper-class Ss. The upper-class S benefits
less from practice as his performance
is nearer the limits of his learning ability
during the initial trials. However, this
does not explain why a similar improvement
in learning did not occur among the
normal Ss in the low-SES group. When
one examines the raw data on Table 2
there is some indication that the low-SES
normals benefit more from practice than
the high-SES normals. The low-SES normals
took about two more trials to learn
paired-associates on Day 1 as compared
to the high-SES normals. By Day 3, there
was little difference in trials between the
two groups. Although the normal low-SES
Ss did not learn as fast as the high-SES
normals on Day 1, they did much better
than the retardates. It appears that the
low-SES normals are somewhat more
knowledgeable in techniques of paired-associate
learning and therefore do not benefit
from practice to as large a degree as
the low-SES retardates.

The second finding showing greater difference
in learning ability between IQ
groups in the high-SES than in the low-SES
sample is contrary to usual reports.
On Day 1 tasks, normal IQ Ss learned
faster than retardates in both SES groups.
Over the rest of the tasks, there continued
to be IQ differences in learning ability
among high-SES Ss, but not among low-SES
Ss where differences in learning ability
gradually disappeared. Why should IQ be
a better predictor of learning ability in the
high-SES than in the low-SES group?

As indicated earlier, education and occupation
are crude measures of SES. Probably,
there is much more variability in the
kinds of environmental experiences in the
low SES than in the high SES. The work
of some investigators (Deutsch, 1968;
Eells, 1951) suggests that children from
upper-class homes have a more common
environment due to prolonged schooling,
more family stability, and more access
to books, magazines, and newspapers. The
environment of lower-class children is less
uniform due to inconsistencies in schooling,
more mobility, less family stability,
and less exposure to books and magazines.
Thus, the performance of the low SES
will be more variable and unpredictable.

These results are not meant to imply
that there are not any organic retardates
among the low-SES population. Rather,
the findings are consistent with the hypothesis
that the distribution of learning
ability in the low SES does not differ significantly
from that found in the high
SES. The experiment randomly sampled a
small portion of the large population of
low-IQ Ss in the low SES, so it is possible
that the sample included few if any organic
retardates.

The experimental data supported Hy- 
potheses II and III, but failed to confirm 
Hypotheses I and IV. IQ was a valid in- 
dex of the serial learning rate of the two 
ability groups in both social classes, which 
was not predicted by Hypothesis I. It is 
noted from Table 2 that the difference was 
small between normals and retarded in 
number of trials to criterion. It is possible 
that the serial list was not sufficiently diffi- 
cult and that a longer list would differ- 
entiate more clearly among groups. An 
alternate hypothesis is that serial learn- 
ing may also depend on certain subskills 
which have not been analyzed as yet. 
Thus, the low-SES retardate may have
been just as handicapped by environmental
deficiencies in certain subskills required by
serial learning as he was on the paired-associate
task. Data on the learning of
several serial lists might shed some light
on this area.

Paired-associates closely reflected IQ
differences in both the high- and low-SES
groups as predicted by Hypothesis II. The
effect of mediation instruction was to
greatly facilitate paired-associate learning,
confirming Hypothesis III. However, the
use of verbal links did not wipe out IQ
differences as normals continued to maintain
their superiority.

Data failed to support Hypothesis IV 
as there was no evidence of the mediation 
set being transferred to the learning of a 
new list of paired-associates on Day 3. 
However, one cannot conclude that re- 
tardates are basically deficient in the 
ability to transfer skills acquired in one 
situation to another as normal Ss did not 
show any superiority in retaining the 
mediation habit. The lack of difference 
between mediation and nonmediation 
groups on Day 3 can be explained in several
ways. First, one training session is
probably not sufficient to inculcate the habit
of forming verbal chains between corresponding
pairs on a paired-associate task.
All of the retarded and many of the nor- 
mals were unable to describe any method 
which they had used to learn the paired- 
associates on Day 3. Second, some of the 
control Ss spontaneously developed habits 
of making verbal links which may have 
canceled any special benefits which had 
occurred in the experimental group. For 
example, among the high-SES normals 
half the Ss in the mediation group re- 
ported mediating, but just about the 
same number of the nonmediation group 
described the use of some kind of mnemonic 
device. Third, instruction in the use of 
verbal mediators may have interfered with 
previously established habits of learning. 
For example, one S who reported attempt- 
ing to make up sentences to link the 
pairs, as he had been taught on Day 2, 
took more trials to learn the paired-asso- 
ciate list on Day 3 than he had on Day 
1. Fourth, the mediation group may not
have been able to make up as elaborate
verbal chains as were provided by E on
Day 2. Recent data by Rohwer (1965) indicate
that the structure of language plays
an important role in the amount of facilitation
produced by verbal chains. In their
role as connectives, verbs produced the
most facilitation and conjunctions the
least, while prepositions were somewhere
in the middle. Sentences provided by E in
the present investigation were made up of
from five to seven words and always included
a verb.

The present experiment has shown that 
there are differences in learning abilities 
among mentally retarded which need to 
be considered in planning their educa- 
tional program. Low measured IQ and a 
history of educational failure should not 
be the only criteria for placement in a spe- 
cial class. An additional measure should 
consist of verifying whether in fact the 
mentally retarded cannot learn from appropriate
experiences. The customary assessment
fails to take into account whether
these persons are low in IQ and achievement
due to organic deficiencies or due to
an environment which has failed to provide
them with the necessary knowledge
and skills.

The development of the meaning of adverbial modifiers was studied
by pair-comparison and ranking methods. The adverbs studied consisted
of slightly, somewhat, rather, pretty, unmodified form, quite,
decidedly, unusually, very, and extremely. The scaling tasks were administered
to Ss in grades 1, 2, 3, 4, 5, 6, 8, 10, 12, and college. Obtained
scale values were highly reliable. Scalability was seen to relate
positively to age-grade classification. Primary-grade Ss identified at
least 3 adverb groups while adults identified about 6 groups. Correlations
of scale values of primary-grade Ss with college Ss ranged from
.74 to .94. All other groups yielded correlations with college data above
.90. Some words were seen to shift in meaning as a function of age-grade
group. Results are interpreted in terms of applications to general
scaling methodology, measurement methodology with young
children, and research in language development.


The study of the quantification of the 
meaning of words has become increasingly 
important since Osgood (1952) introduced 
the semantic differential technique for 
studying connotative meaning. However, 
some word sets, such as certain adverbs 
and adjectives, apparently can be ordered 
along specific continua and might have 
general quantifiable denotative meaning. 

Reports of the actual scaling of word
meaning are rare. Darley, Sherman, and
Siegel (1959) scaled the abstraction level
of nouns, adjectives, and verbs with a high
degree of reliability and Cliff (1959)
studied extensively the scaling of adverbial
modifiers. Cliff was interested primarily
in a theory of the use of adverb scale
values as multipliers of the scale values of
adjectives. This latter study is of major
interest.

Cliff (1959) studied a set of 10 adverbial 
modifiers in combination with 15 adjec- 
tives. He administered the adverb-adjec- 
tive combinations as a successive-intervals 
judging task using an 11-point response 
continuum. His subjects (Ss) were college 
elementary psychology students. The adverbs
generally scaled in the following order
(from low to high intensity of modification):
(a) slightly, (b) somewhat, (c)
rather, (d) pretty, (e) unmodified form,
(f) quite, (g) decidedly, (h) unusually,
(i) very, and (j) extremely. Cliff's average
scale values (reported by Dudek, 1959)
were, respectively, .55, .69, .86, .92, 1.00,
1.07, 1.20, 1.30, 1.30, and 1.53. (The scale
is transformed to make the unmodified 
form the unit of the scale.) Cliff found 
that, usually, the choice of adjective did 
not affect the ordering of the adverbs. One 
notable exception was the adverb “pretty.” 
This adverb was seen to make unfavorable 
adjectives more extreme and to make favor- 
able adjectives less extreme. 

A third relevant paper (Dudek, 1959)
was methodological in nature. Dudek compared
scale values of the adverbs studied
by Cliff as determined by successive intervals
and by the constant-sum method.
The two methods did not yield exactly
equivalent results; however, results were
similar. Dudek's average scale values for
Cliff's adverbs were, respectively, .74, .83,
.94, 1.02, 1.00, 1.28, 1.47, 1.68, 1.66, and
2.07. Note that in only two cases are these
values not in the same order as Cliff's
values. The two reversals are pairs of adverbs
that are extremely close in meaning
as evidenced by the similarity in magnitude
of the scale values.
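
The two reversals noted above can be located directly from the scale values quoted in the text. The following sketch uses only those published values; the function name is illustrative, and ties (such as Cliff's equal values for "unusually" and "very") are not counted as reversals.

```python
from typing import List, Sequence, Tuple

# Adverbs in the order listed in the text, with the quoted average scale values.
ADVERBS = ["slightly", "somewhat", "rather", "pretty", "unmodified",
           "quite", "decidedly", "unusually", "very", "extremely"]
CLIFF = [0.55, 0.69, 0.86, 0.92, 1.00, 1.07, 1.20, 1.30, 1.30, 1.53]
DUDEK = [0.74, 0.83, 0.94, 1.02, 1.00, 1.28, 1.47, 1.68, 1.66, 2.07]


def reversals(values: Sequence[float], labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Return label pairs (earlier, later) whose values run against the listed order."""
    return [(labels[i], labels[j])
            for i in range(len(values))
            for j in range(i + 1, len(values))
            if values[i] > values[j]]


print(reversals(CLIFF, ADVERBS))  # Cliff's values follow the listed order
print(reversals(DUDEK, ADVERBS))  # Dudek's values depart from it in two pairs
```

For Dudek's values the pairs returned are "pretty"/"unmodified" and "unusually"/"very," and in each pair the two scale values differ by only .02.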

Allison (1963) showed the importance of 
the study of adverb meaning by applying 
Cliff's results to semantic differential 
methodology. Allison attempted to in- 
crease the range of scale values by identi- 
fying scale end points with adverbs of high 
modification. 

The present investigation extends the 
study of adverb meaning downward to the 
first-grade level. It was hypothesized that 
the ability to discriminate between adver- 
bial modifiers increases with chronological 
age and educational level. For example, it 
was believed that children in the primary 
grades could provide at least polar classifi- 
cation of adverbs—that is, they could 
identify adverbs as extreme modifiers or 
moderate modifiers. Older children were 
expected to make finer distinctions. Du- 
dek’s (1959) report indicates that college 
students distinguish about eight points 
(out of a possible 10) on an adverbial in- 
tensity of modification scale; however, this 
conclusion was not based on statistical 
tests of the differences in preferences for 
members of pairs. It was predicted that a 
monotonically increasing function will be 
observed relating age-grade group to the 
ability to discern adverbial modifiers, It 
was also predicted that a monotonically 
decreasing function will be observed re- 
lating age-grade group to the dispersion 
values of the adverbs. 

An important outcome is the determina- 
tion of differences in semantic meaning 
that various age groups give to the several 
adverbs. Another important outcome is in- 
sight into appropriate methodology for pre- 
senting abstract scaling tasks to primary 
grade children.
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Characteristics of the Population 


The population of interest is the general pubj. 
school membership. It was desired to hoe 
upper-level group that consisted of highly edu. 
cated persons rather than members of the 
public since this group was used for the “criterion 
ranking.” For this reason, college sophomores were 
considered to be a good choice. Data obtained on 
college sophomores could also be compared to that 
of Cliff and Dudek to determine if generalization. 
to other groups and methodologies can be made 
safely at the upper range of ability and chrono. 
logical age. If these comparisons are favorable, 
then one can have more faith in the generalizations 
drawn from data from younger groups. 

The population was restricted to students who 
were within the typical chronological age range 
for their grade group. Thus, slow learners and 
students who have skipped grades were deleted, 
This step was taken to assure that grade groups 
will not have extreme variability with respect to 
chronological age. Thus, grade groups can be 
treated roughly as age groups.


Characteristics of the Sample 


Intact classroom groups were used for the entire 
study. Sampling of classroom groups was incidental 
within the limitations specified below. About 180
Ss were chosen from each of the grade groups 1,
2, 3, 4, 5, 6, 8, 10, 12, and 13. No limitation was
imposed on the membership of the sample other than
that each child must have had a chronological age
within 6 months of the modal age for his grade
group. This limitation did not apply to the college
sample. In some grades, the limitation resulted in
the discarding of considerable data. Final sample
sizes ranged from 116 to 184.

Care was taken to choose schools which were
not expected to be atypical in any way in
the language achievement of the students. Classroom
groups chosen were not grouped by any
scheme of homogeneous grouping. Public schools
were chosen such that the elementary schools were
feeder schools to the junior high schools and the
junior high schools were feeder schools to the
senior high school. Hopefully, this selection
resulted in some control over socioeconomic
differences among the grade groups. All schools were
located in a suburban area near Atlanta, Georgia.


General Design 


The dependent variables under study are estimates
of the various adverb-scale characteristics,
namely, scale values, scale dispersions, and
number of discernible scale points for the adverbs
used by Cliff and Dudek. The independent
variable is age-grade group.

The 10 adverbs were used with the adjective
“large” to form the adverbial phrases used in the
scaling tasks. The neutral adjective “large” was
chosen so that nonabstract examples could be used
by the examiner in the task administration and by
Ss in their consideration of the various stimuli
pairs. Each adverb-pair was presented in both
orders of presentation for most Ss (exceptions are
noted below). Thus, most Ss responded to a randomized
list of 90 pairs, or two complete replications.
Also, each S was presented a randomized
list of 10 adverbial phrases and was asked to rank
these from low to high. The pair-comparisons always
preceded the ranking task.
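
The count of 90 pairs follows from presenting each of the 45 unordered pairs of the 10 phrases in both orders. A minimal Python sketch of such a presentation list is given here for illustration only; the phrase spellings, the fixed random seed, and the variable names are assumptions and are not taken from the study materials.

```python
import random
from itertools import combinations, permutations

# Illustrative spellings only; "large" alone stands for the unmodified form.
PHRASES = [
    "slightly large", "somewhat large", "rather large", "pretty large",
    "large", "quite large", "decidedly large", "unusually large",
    "very large", "extremely large",
]

# Each unordered pair of phrases: 10 * 9 / 2 = 45.
unordered_pairs = list(combinations(PHRASES, 2))

# Each pair in both orders of presentation: 2 * 45 = 90.
presentation_list = list(permutations(PHRASES, 2))

# Randomize the order of presentation (seed fixed only for repeatability).
random.Random(0).shuffle(presentation_list)

print(len(unordered_pairs), len(presentation_list))
```

A S who completed only one replication would, under the same assumptions, receive the 45 unordered pairs rather than the full list of 90.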

Materials and methods were modified as necessary
for the younger Ss. Primary-grade Ss received
materials printed with large type and instructions
were simplified for them. The examiner read aloud
any words upon request of middle-grade children.
All tasks were read aloud to all Ss in the first and
second grades. Example tasks were used to verify
that all children understood the nature of the
pair-comparison tasks and understood the concept of
“largeness.” During the second-grade administration
it was observed that the Ss took an unusually
large amount of time to complete all tasks.
Therefore, part of the second-grade group and all
of the first-grade group were asked to complete
only one replication of the pair-comparisons tasks.


Analyses 


Scale values. Scale values of each adverb were 
determined for each age group and for each repli- 
cation by pair-comparison methodology (Edwards, 
1957, pp. 19-36). 
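
As an illustration of how pair-comparison scale values of this kind can be computed, the following Python sketch applies the Thurstone Case V solution, in which choice proportions are converted to normal deviates and averaged by column. It is an assumption that this is the variant applied in the study; the function name, the clipping constant `eps`, and the example proportions are hypothetical.

```python
from statistics import NormalDist, mean


def case_v_scale_values(p, eps=0.01):
    """Return centered Case V scale values.

    p[i][j] is the proportion of judgments in which stimulus j was
    chosen over stimulus i (diagonal entries are ignored).
    eps is a hypothetical clipping bound that keeps proportions of
    0 or 1 from producing infinite normal deviates.
    """
    n = len(p)
    inv_cdf = NormalDist().inv_cdf
    z = [
        [0.0 if i == j else inv_cdf(min(max(p[i][j], eps), 1.0 - eps))
         for j in range(n)]
        for i in range(n)
    ]
    # The scale value of stimulus j is the mean of column j.
    column_means = [mean(z[i][j] for i in range(n)) for j in range(n)]
    # Center the values so that they sum to zero.
    grand_mean = mean(column_means)
    return [value - grand_mean for value in column_means]


# Hypothetical proportions for three stimuli (not data from the study).
example_p = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.8],
    [0.1, 0.2, 0.5],
]
example_values = case_v_scale_values(example_p)
print([round(v, 2) for v in example_values])
```

Centering on zero matches the form in which scale values are usually tabled, with negative values for the weaker modifiers and positive values for the stronger ones.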

Dispersion values. Dispersion values for each 
scale value were determined for each age group 
by pair-comparison methodology (Edwards, 1957, 
pp. 58-66).

The number of discernible scale points. This 
value was determined for each age group based on 
significance tests of differences in rank-totals. This 
procedure is based on the relationship of pair- 
comparison methodology to analysis-of-variance 
of ranked data (Dunn-Rankin, 1965; Dunn-Rankin
& Wilcoxon, 1966).

Scalability Index. It was desired to have an index
that would show the degree to which the adverbs
were scaled by each group of Ss. The coefficient
of agreement (u; Edwards, 1957, pp. 76-78)
is one such index and it is reported. However,
a particular hypothesis that was to be tested in
this study dealt with the number of distinguishable
scale points in the adverb-scale of each group of
Ss and this hypothesis requires a different statistic.

The initial analysis determined the adverb
pairs that were statistically distinguishable using
Dunn-Rankin’s (1965) technique. This analysis, in
some cases, divided the 10 adverbs into mutually
exclusive sets. Usually, however, sets were not
mutually exclusive, but overlapped. For example,
“slightly” (sl) and “somewhat” (so) might be perceived
as synonymous, “somewhat” and “rather”
(r) might be perceived as synonymous, but
“slightly” and “rather” might be perceived as different.
The Dunn-Rankin multiple-range test
would show sl = so, so = r, and sl < r. This result
should be classified as “more scalable” than the result
sl = so = r, but it should be classified “less
scalable” than the result sl < so < r.

It was decided to use a simple coefficient of
scalability that reflects the number of discernible
pairs. The coefficient decided upon was the ratio
of discernible adverb pairs to the total number of
pairs (which was 45 in all cases). This ratio was
expressed as a percentage and was called SI for
“Scalability Index.”

The coefficient SI has a possible range of 0 to 100. SI
has a minimum of zero if no pairs yield statistically
significant differences and takes the maximum
of 100 if all pairs are mutually significantly different.
Only in the latter case does one get a total
ordering of the stimuli.
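
The arithmetic of SI can be sketched in a few lines of Python. The function name and the error check are illustrative assumptions; the counts used in the example (29 and 28 distinguishable pairs) are the first- and second-grade pair-comparison counts reported in the Results.

```python
from math import comb


def scalability_index(distinguishable_pairs, n_stimuli=10):
    """Percentage of stimulus pairs that are statistically distinguishable."""
    total_pairs = comb(n_stimuli, 2)  # 45 pairs for 10 adverbs
    if not 0 <= distinguishable_pairs <= total_pairs:
        raise ValueError("pair count must lie between 0 and the total")
    return 100.0 * distinguishable_pairs / total_pairs


# Pair-comparison counts reported for Grades 1 and 2.
print(round(scalability_index(29)), round(scalability_index(28)))
```

Rounded to whole percentages, these counts give the SI values of 64 and 62 quoted for the two youngest groups.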

The investigators are not proposing SI as a new 
statistic to replace coefficients like the coefficient of 
agreement or Guttman’s reproducibility coefficient.
It is a statistic that, in the judgment of the inves- 
tigators, adequately summarizes the results of each 
group of Ss so that a test of the developmental 
hypothesis of adverb meaning can be accomplished. 


Results


Results are presented for each specific 
objective of the study. The first two ob- 
jectives were to determine, by pair-com- 
parison technique, the scale values and dis- 
criminal dispersions of each adverb for 
each age-grade group. The scale values for 
each analysis appear in Table 1 and the 
corresponding discriminal dispersions ap- 
pear in Table 2. The adverbs are in the 
same order as listed by Cliff (1959) and
Dudek (1959). The abbreviations are 
slightly (sl), somewhat (so), rather (r), 
pretty (p), unmodified form (l), quite (q),
decidedly (d), unusually (u), very (v), 
and extremely (e). The correlations be- 
tween scale values of replications within 
grade groups provide estimates of all reliabilities
of the pair-comparison data. The
lowest of the 11 reliabilities was .92. Seven
of the 11 correlations were .97 or higher. 
The correlations indicate a high reliability 
for scale values, even for the primary 
grades. No check was made of the correla- 
tions between rankings for individuals. 

The third and fourth objectives were to 
determine the number of discernible scale 
points at each grade level and to determine
at what age children can first begin to

TABLE 1
Pair-Comparison Scale Values for 10 Adverbs by Grade Level
                          Adverb
Grade   N     sl    so    r    p    l    q    d    u    v    e

1 187 —-23) —.26} —.08 | —.20 | —.34 | —.36 13] 23] 62] 4 

2a 102 —.22| —.46| —.10 | —.34 | —.32 | —.35 +20} 12) 163] 

2b 82 —.48| —.26] —.23 | —.26 | —.29 | -—.11 +29] 16] .57] 6 
2c 82 —.26) —.26| —.24 | —.40 | —.33 | —.18 24) 15] .56] | 
3a 158 —.69 | —.36] —.33 | —.35 | —.24 | —.03 21 | .24] 165] 109 | 
3b 158 —-56| —.42}) —.31 | —.41 | —.21 | —.12 -20 | .18] 60] 1.05 | 

4a 143 —.93 | —.75| —.28 | —.33 -04 07 14 | .20) 175} 1.0 

4b 143 —-84) —.63) —.33 | —.36 | —.02 03 +10 | .22] 83} 99 

5a 116 —1.19] —.76| —.26 | —.46 | —.23 -13 04 | .51] .79/ 1,4 

5b 116 —.99) —.70) —.84 | —.42 | —.35 -16 | —.08 | .48) .82) 149 

6a 128 —.97 | —.63| —.11 | —.23 | —.37 +24 :02 | 58} .57] .79 

6b 128 —-85) —.79] —.87 | —.40 | —.41 -O7 | —.25 | .49 | 1.02} 1.49 

8a 162 —1.07} —.51| —.05 | —.26 | —.47 -25 | —.02] .81] .52] 8 

8b 162 —-81) —.71|] —.88 | —.46 | —.61 +14 | —.38 | .79] .96 | 1.46 

123 —1.25| —.72| —.15 | —.22 | —.43 -23:| —.06 | .89] .66 | 1.04 

123 —1.04) —.84] —.46 | —.53 | —.56 | —.01 | —.28 | 1.04] 1.04] 1.65 

164 —1.48/ —.92) -—.12 | -—.37 | —.34 +26 -07 | 1.00] .70} 1.21 

164 —1.10 | -1.08} —.45 | —.55 | —.49 07 | —.20 | 1.01] 1.01) 1.7 

163 —1.40 | -1.01/ —.49 | -.51 | —.40 -85 30 | 1.15 | .68 | 1.0 

163 1.14 | -1.00] -—.55 | —.53 | —.47 17 21 | 1.08] .83 | 1.40 


discriminate between the adverbial modifiers.
The numbers of discernible scale points were
determined from the ranking data, and, in the case
of Grades 1 and 2 only, from the pair-comparison
data. Tables 3 and 4 show the relevant results.
Table 3 shows the rank-order of all adverbs for
each group based on the ranking data.


TABLE 2

TABLE 3
Summary of Adverb-Scaling Results: Final Orders
and Indistinguishable Adverbs

Group   Ranking data results
  1     R    L    Q    Sl   P    So   D    U    V    E
  2     R    Q    L    Sl   P    So   U    D    V    E
  3     Sl   L    R    P    So   Q    V    D    U    E
  4     Sl   L    So   R    P    Q    D    U    V    E
  5     Sl   L    So   R    P    D    Q    V    U    E
  6     Sl   L    So   R    P    D    Q    V    U    E
  8     Sl   L    So   P    R    D    Q    V    U    E
 10     Sl   So   L    P    R    D    Q    V    U    E
 12     Sl   So   P    L    R    D    Q    V    U    E
 13     Sl   So   R    L    P    Q    D    V    U    E
Pair-comparisons results 

1 P Ri Q SCS SL TES RR 2) le ei 
So P L QU SRSLIR a Unt AV re ie 


Note.—Indistinguishable pairs are underlined—solid lines show 95% confidence level and dotted 


lines show 90% confidence level. 


Indistinguishable pairs are underlined. The 
numbers of distinguishable pairs, the cor- 
responding SI values, and the coefficients of 
agreement, u, appear in Table 4. 

It is apparent from these data that a 
large number of primary grade students 
can properly classify the adverbs into at 
least three sets. The number is sufficiently 
large in the first grade to yield 26 signifi- 
cant differences in scale values at the first- 
grade level. It is apparent, from the low 
u-values, that there is also considerable 
disagreement as well as agreement in the 
primary years. The differences between 
students at this level who can and who 
cannot correctly perform the scaling tasks 
should become the subject of a series of 
highly interesting and informative stud- 
ies. The fifth objective was to determine 
the relationship of age-grade group to the 
number of discernible scale points.

The regression of adverb distinguishability
on age-grade group appears to yield a
basically monotonically increasing function.
Only two pairs of SI-values are not
in monotonic order and only one u-value
is out of place with regard to a monotonic
regression function.


The pair-comparison data of the two 
youngest groups were also analyzed by 
Dunn-Rankin’s technique and these re- 
sults are also in Table 3. The results were 
slightly better than the ranking results. 


TABLE 4
Scalability Indexes by Age-Grade Groupings

                    Ranking data
Group     N     Distinguishable pairs   SI(%)a    ub
  1      187             26               58      14
  2      184             27               60      ?4
  3      158             33               73      24
  4      143             35               78      32
  5      116             37               82      42
  6      128             37               82      33
  8      162             39               87      38
 10      123             38               84      46
 12      164             40               89      51
 13      163             39               87      59

a Scalability Index: ratio of the number of
distinguishable pairs to the possible maximum
of distinguishable pairs (45) expressed as a percentage.
b Coefficients of agreement are based on scale
values averaged over replications.

TABLE 5
Correlations of Pair-Comparison Scale
Values for Each Group with
Adult Scale Values


Group 13a 13b 
1 74 81 
2a 76 83 
2b 87 91 
2c 81 86 
3a 92 94 
3b 88 92 
4a 93 94 
4b 91 93 
5a 96 96 
5b 94 96 
6a 97 96 
6b 91 94 
8a 97 95 
8b 91 95 

10a 97 97 
10b 94 97 
12a 98 97 
12b 95 98 


Note—An r of .63 is significant at the .05 
level of confidence for df = 8. 


The number of distinguishable pairs was 
29 for the first grade (SI = 64%) and 28 
for the second grade (SI = 62%). 

The sixth objective deals with the rela- 
tionship of dispersion values of adverbs to 
age-grade group. It was hypothesized that 
the dispersion values would tend to decrease
with age. However, an inspection of
Table 2 reveals no evidence of trends. One 
could conclude that the scale values are 
just as variable for adults as for primary 
grade children. However, the u-values of 
Table 4 reveal the degree to which adults 
are more in mutual agreement than were 
the younger children. 

The final objective was to compare the 
scale values of each grade to the scale 
values of adults (Group 13). The scale 
values of each group were correlated with 
the two sets of scale values for the adults. 
These correlations appear in Table 5.

The relationship of the present results
to the results presented by Cliff (1959) and
Dudek (1959) is of considerable interest
because of the differences in populations
sampled and in scaling methodology. The
results of Cliff and Dudek were based on
data from college sophomores, so the intercorrelations
of their scale values and those
of the sophomores in this study give evidence
of generalizability across samples
and methodologies. Correlations were calculated
between the scale values for the
two replications of Group 13 and the average
scale value of Cliff’s “good” adjectives,
Cliff’s “bad” adjectives, Cliff’s overall
averages, Dudek’s “favorable” adjectives,
Dudek’s “unfavorable” adjectives, and
Dudek’s overall averages. Each of these 12
correlations was .96 or higher, indicating
extremely high similarity among the results
of the present study and the two previously
reported studies. The 6 x 6 intercorrelation
matrix of Cliff’s and Dudek’s results
(unreported) had no value below .94.

The correlations of the results of each 
grade group with the Cliff and Dudek re- 
sults are also of interest as evidence of 
generalizability of the developmental data. 
Since the adult intercorrelations were so
high, the correlations of each group with
the Cliff and Dudek results were quite similar
to the corresponding correlations
with Group 13 so they are not reported.
However, the correlations did give some
additional information. There was a tendency
for the values to be slightly larger for
“good” and “favorable” adjectives than
for “bad” or “unfavorable” adjectives.
The present task involved a neutral adjective
“large” which is apparently perceived
as more like a positive than a
negative adjective.


Discussion 


The results of the study were surprising
in one regard: the younger Ss performed
the task with considerably more skill than
was anticipated. The reliabilities and the
correlations with adult values were much
higher than was expected. On the other
hand, the adult data, although highly reliable
and highly consistent with results reported
in other studies, were not as internally
consistent as was expected. This
claim is based on the low number of zero
and near-zero frequencies in the preference
matrix for adults. In any case, all
groups yielded highly reliable scale values.

The investigators were especially pleas- 
antly surprised to see the ease with which 
the younger children handled the complex 
ranking task. The Ss did have considerable 
practice at making pair-wise decisions; 
however, in spite of this the full ranking 
task was expected to be extremely difficult. 
There is some weak evidence that the pair- 
comparison task did yield better data. The 
SI values were 58% (first grade) and 60% 
(second grade) for total ranking as com- 
pared to 64% and 62%, respectively, for 
pair-comparisons.

One major value of the study is the determination
of relative meaning of the adverbs
as a function of age and training.
The relative meanings can best be seen in 
Table 3, which lists the adverbs in in- 
creasing order of scale value. Note the shift 
in meaning of the words “somewhat” 
(so) and “slightly” (sl) from neutral at 
the primary level to extreme at the adult 
level. 

The word “very” is also of high interest 
due to the large use of it in defining re- 
sponses in attitude scaling. “Very” is seen 
as less strong than either “extremely” or 
“unusually” for grades higher than five. 
“Very” is seen as less strong than “ex- 
tremely” above Grade 2. But Grades 1 and 
2 equate “extremely” and “very.” 

It is expected that many scaling studies
can be improved by using the reported
scale values to increase the variance of responses
and clarify anchoring definitions.
The researcher would need to choose adverb
scale values corresponding to the age
and training of his Ss. Note that if one
wanted to scale objects according to
“size,” one could list responses from “extremely
small” through “slightly small”
and “slightly large” up to “extremely
large.” Cliff (1959) indicated considerable
invariance of adverb scale values with
regard to the adjective modified, so one
would expect the values reported herein to
apply to adjectives other than “large.” The
use of the various adverb scale values to
space responses conceivably could help
assure that response intervals were indeed
on a meaningful interval scale. In some studies,
one could possibly apply the multiplication
rules of Cliff (1959) for determining
adverb-adjective-pair scale values for the
same purpose. The scale values can also
be used creatively to break down lack of
variation due to generosity effect. For example,
an employee could be rated from
“slightly good” to “very good” instead of
from “very bad” to “very good.”

In general, one can conclude that there 
is need for considerable care in the con- 
struction and administration of scaling 
tasks for young children. This conclusion is 
based largely on the dearth of methodolog- 
ical work at this age level. It is based in 
part on the findings of some differences in 
the meanings that various adverbs have for 
different age groups and for individuals 
within age groups. The meaning of words 
that are not routinely taught in vocabulary 
instruction cannot be assumed to have 
identical meaning to all persons, although 
the adverbs studied do have fairly stable 
meaning across age-grade group. 

There is also need for the study of other 
word-types and how the usage and meaning 
of words change with age and training. 
Perhaps the study of abstraction levels of 
words (Darley, Sherman, & Siegel, 1959) 
can provide a model for this needed body of 
research.

An equation to predict teaching effectiveness from the California
Psychological Inventory (CPI) was developed on a sample of 293
students (215 females, 78 males). The equation included scales for
socialization, good impression, achievement via conformance,
psychological-mindedness, and flexibility, combined in this form: 14.743 +
.334So - .670Gi + .997Ac + .909Py - .446Fx. Cross-validation of the
equation on 17 higher-rated vs. 17 lower-rated student teachers gave
a t ratio of 4.27 (p < .001), and a biserial correlation of +.44.
Conceptual analysis of the dimension defined by the equation suggested
personological bases of conscientiousness vs. undercontrol of impulse
for males, and of resoluteness versus indifference for females.


The prediction of performance in stu- 
dent and/or professional teaching is one of 
the long-standing, unsolved, and perhaps 
(some would say) unsolvable problems of 
educational psychology. Still, in spite of 
its difficulty (cf. Barr, 1958), the issue con- 
tinues to excite the interest of researchers, 
and significant publications on teaching ef- 
fectiveness (cf. Biddle & Ellena, 1964; El- 
lena, Stevenson, & Webb, 1961; Ryans, 
1960) continue to attend to it. Gage’s 
Handbook (Gage, 1963), in fact, carries two
chapters in which important discussions of 
the problem are to be found (Getzels & 
Jackson, 1963; Stern, 1963). Thus, although 
a certain weariness is understandable, there
is no need as yet to abandon inquiry.

Specification of a criterion of effective- 
ness has been a principal difficulty, and in 
the judgment of Getzels and Jackson per- 
haps the major stumbling block. Tradi- 
tionally, there have been three ways of 
establishing the criterion: first, from an 
evaluation of the scholastic achievement of 
students; second, from ratings or judgments
by supervisors who have observed
the teacher in the classroom; and third, 
from ratings or judgments furnished by 
the students themselves. 

There are obvious problems in using the 
first method (e.g., controlling for intel- 
lectual level and past experience of the 
students), and current opinion (cf. Chauncey
& Dobbin, 1964) is generally against
its employment. 


The second method is more favored, 
particularly if observations can be system- 
atized so that both intra- and interjudge 
comparisons can be made. Various rating 
forms which fulfill these requirements have 
been developed. For example, Durflinger 
(1963) has devised a 41-item teacher- 
evaluation scale, and Michaelis (1954) 
has described a progress report form capa- 
ble of yielding quantitative indices. 

Ryans’ observational report form (Ry- 
ans, 1960) is another illustration; this form 
has been factor analyzed, leading to the 
description of three major patterns of 
classroom behavior: understanding and 
friendly; responsible and businesslike; and 
stimulating and imaginative. 

The third method, rating of teachers by 
students, has been less frequently em- 
ployed, but useful instruments similar to 
those of Durflinger and Ryans are begin- 
ning to appear. Veldman and Peck (1963) 
have introduced a 38-item Pupil Observa- 
tion Survey (POSR) which yields five 
factors: (a) friendly, cheerful; (b) knowledgeable,
poised; (c) interesting, preferred;
(d) strict control; and (e) democratic
procedure. Factors I and IV seem
to correlate highly with supervisors’ ratings
of the same teachers. Factor II has a
moderate correlation, whereas Factors III
and V bear little relationship.

The “criterion problem” is still far from 
solved, but as indicated above there do appear
to be instruments and techniques
available which can provide acceptable indices
of performance.

If we turn from the criterion to con- 
sideration of the psychological tests which 
might be applied as predictors of teaching 
effectiveness, further problems are en- 
countered. Much of the evidence is nega- 
tive. For example, in the intellectual realm, 
Durflinger obtained correlations of —.08, 
—.13, and —.13 for the quantitative, 
linguistic, and total scores on the Ameri- 
can Council on Education Psychological 
Examination (ACE) against supervisors’ 
ratings of 150 student teachers on his 41- 
item scale. Stern (1963) concluded that in- 
tellectual measures have in general been 
unpredictive in this realm. 

Several years ago, the Minnesota Teacher 
Attitude Inventory (Cook, Leeds, & Callis, 
1951) seemed very promising because of the 
statistical and psychological sophistication 
invested in its development. Unfortunately, 
research evidence has not always sustained 
this promise. Sandgren and Schmidt (1956)
classified 393 student teachers in high-mid- 
dle-low categories on the MTAI, but these 
three categories bore no relationship to su- 
pervisors’ evaluations. Durflinger, in the 
study already mentioned, found a correla- 
tion of —.12 between the MTAI and the 
criterion of teaching effectiveness, and 
Burkard (1965) did not find the MTAI to 
be valid. 

The Minnesota Multiphasic Personality 
Inventory (Hathaway & McKinley, 1943) 
has been widely applied and because of its 
sensitivity to problems of maladjustment, 
neuroticism, and anxiety would seem to be 
relevant at least to ineffectiveness as a 
teacher. The research evidence, however, 
has not been encouraging. Tyler (1954) 
administered the MMPI to men enrolled 
in student teaching in secondary schools, 
and attempted to develop predictive indices
on the basis of individual scales,
combinations of scales, and item analysis
of the inventory; these efforts were unsuc- 
cessful when checked in cross-validation. 

Gough and Pemberton (1952) obtained
moderately positive results in an attempt
to forecast teaching effectiveness among female
subjects (Ss), using the “sign” approach
on the MMPI, and Gowan and
Gowan (1955) also obtained moderately
favorable results in an item analysis of the
inventory. However, Michaelis (1954), in
a very searching study of the MMPI
among 271 females in student teaching,
obtained negative findings for both scale
and item analyses, and Moore and Cole
(1957), using the MMPI, were unable to
distinguish among teachers rated as best,
above average, average, below average,
and poorest.

Michaelis also obtained relatively un- 
promising results with the Heston Personal 
Adjustment Inventory (Heston, 1949), the 
Minnesota Personality Scale (Darley & 
McNamara, 1941), and the Minnesota 
T-S-E Inventory (Evans & McConnell, 
1942). 

Perhaps the problem has been, as suggested
by Peck (1960), that these inventories
stress qualities which are not of
great relevance to teaching, and hence are
unable to achieve an adequate level of
predictive accuracy. The California Psychological
Inventory (Gough, 1957), developed
to assess “folk dimensions” of interpersonal
and interactional behavior,
might be less subject to Peck’s animadversion;
that is, variables such as socialization,
psychological-mindedness, and flexibility
(all of which are scaled on the inventory)
might be more relevant to teaching performance
and therefore useful in its prediction.

Three studies employing the CPI to forecast
teaching performance may be cited.
The first, by Durflinger (1963), has already
been mentioned. Against the criterion
of ratings, the 18 scales gave individual
correlations ranging from -.27 to
+.12. In the second study (Hill, 1960),
student teachers were dichotomously classified
as better or poorer by faculty supervisors,
and an effort was made to discriminate
between the two groups by
means of CPI scales. Some minor trends
were observable (e.g., higher scores on the
So scale for the higher-rated students),
but the differentiations were far short of a
level of practical utility.

Hill (1961) next conducted an item
analysis of the inventory, pitting
higher-rated versus 50 lower-rated students;
30 items significant at or beyond the
.05 level of confidence were identified.
These 30 items were scored on hold-out
samples of 49 versus 52, giving rise to a
point biserial correlation of +.19.

With such unencouraging results, a
reader might well ask why one should continue
to investigate the issue. The answer
is that in spite of rather modest findings
in these studies, taken separately, certain
common trends were observable, leading to
the notion that a pooling of data, and new
analyses seeking patterns and combinations
of variables, might yield more substantial
returns. The remainder of this
paper will report the outcome of an attempt
to identify a predictive cluster of
scores on the inventory.


First ANALYSIS 


Because the Durflinger criterion was in
continuous form (supervisors’ ratings on a
41-item schedule), it was decided to begin
with Ss from this study. The sample comprised
91 female students from the University
of California, Santa Barbara, doing
classroom teaching under supervision. Each
S was rated by two different supervisors, one
during the first semester of teaching and the
other during a second semester; interjudge
reliability of rating was found to be +.81.
The combined score was used as a criterion.

Correlations of these ratings with the
scales of the inventory were modest, ranging
from -.18 for Fx (flexibility) and
-.14 for Gi (good impression) up to +.13
for Sy (sociability) and So (socialization).

A search was therefore initiated for a
combination of scales which might provide
a more useful basis of prediction than any
single scale taken alone. A stepwise multiple
regression analysis of the 18 scales of
the inventory against the teaching criterion
was conducted, giving rise to a five-variable
equation, including Sy, So, and
Py (psychological-mindedness), with positive
weights, and Gi and Fx with negative weights.
Scores for the 91 Ss, computed according to
the equation, correlated +.36 with the criterion.
—— 

’This analysis and the others reported in the 
Paper were carried out at the University of Cali- 
The next step in the analysis was to
cross-validate this preliminary equation on 
the sample of 202 Ss drawn from Hill’s 
investigations. These Ss included 124 fe- 
males and 78 males, tested and observed 
in the instructional program at Ball State 
University in Muncie, Indiana. The cri- 
terion of teaching effectiveness for these 
students was dichotomous, with 63 fe- 
males and 35 males being classified by 
supervisors as superior (“A” group), and 
61 females and 43 males as inferior (“B” 
group). 

Table 1 presents means and standard 
deviations on the scales of the CPI for the 
higher-rated and lower-rated subsamples, 
and also the results of the t tests for significance.
The higher-rated males scored
significantly higher on the Py scale (p <
.01), and significantly lower on Fx (p <
.05). Higher-rated females exceeded lower-rated
females on the Re (responsibility)
and Ac (achievement via conformance)
scales (p < .01 in both instances).

When the preliminary equation was 
used to compute scores for these 202 Ss, 
higher-rated females had a mean of 50.37, 
SD 7.35, and lower-rated females a mean 
of 48.80, SD 7.95; the difference of 1.57 
gave a t ratio of 1.13 (p = .26) and a
biserial correlation of +.13. 

For males, the results were more en- 
couraging, with a mean of 51.86, SD 8.31 
for higher-rated Ss versus 47.07, SD 6.34 
for lower-rated; the difference of 4.79 gave 
a t ratio of 2.83 (p < .01), and a biserial 
correlation of +.37. These results on cross- 
validation of the preliminary equation, al- 
though modest, were considered sufficient 
to warrant further exploration. 
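
The t ratios quoted above can be approximately recovered from the reported means, standard deviations, and group sizes. The sketch below assumes an equal-variance (pooled) two-sample t ratio; the text does not state which form was computed, so the function and its name are illustrative only.

```python
from math import sqrt


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Two-sample t ratio from summary statistics, pooling the variances."""
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    standard_error = sqrt(pooled_variance * (1.0 / n_1 + 1.0 / n_2))
    return (mean_1 - mean_2) / standard_error


# Summary values as reported for the Ball State cross-validation sample.
t_males = pooled_t(51.86, 8.31, 35, 47.07, 6.34, 43)
t_females = pooled_t(50.37, 7.35, 63, 48.80, 7.95, 61)
print(round(t_males, 2), round(t_females, 2))
```

Under this assumption the computed ratios fall within a few hundredths of the reported 2.83 and 1.13; the small discrepancies may reflect rounding of the published means and standard deviations or a different variance formula.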

Step 3 in the analysis was a sequential
multiple regression analysis conducted 
over the entire sample of 293 Ss. For this 
analysis, the 91 students from Durflinger’s 
project were dichotomized into 46 higher- 
rated and 45 lower-rated, on the basis of 
higher and lower totals on the rating form. 
Correlations of each scale with the 2-ver- 
sus-1 dichotomy were utilized in the analy- 


fornia, Berkeley, Computer Center. The authors 
wish to thank the Center for granting computational
time, and Quintin Welch and Susan Hopkin
for conducting the analyses. 
TABLE 1
Comparison of Higher-Rated and Lower-Rated Student Teachers in the
Ball State University Samples

                     Males                                   Females
          Higher          Lower                   Higher          Lower
Scale     M      SD       M      SD     diff      M      SD       M      SD     diff
Do      29.80   4.80    30.42   5.78   -0.62    28.49   5.96    26.92   5.31    1.57
Cs      19.86   2.98    19.93   3.35   -0.07    21.48   3.17    20.95   3.73    0.53
Sy      25.83   4.87    25.56   4.83    0.27    26.38   4.98    25.48   4.49    0.90
Sp      37.71   5.47    37.05   5.58    0.66    35.35   5.44    35.05   5.20    0.30
Sa      22.57   3.48    23.33   3.17   -0.76    22.68   3.61    22.90   2.95   -0.22
Wb      39.29   2.74    38.42   4.12    0.87    38.44   4.15    37.03   5.06    1.41
Re      31.97   4.06    32.42   3.61   -0.45    34.79   2.44    32.93   3.97    1.86**
So      39.80   5.52    37.53   4.90    2.27    41.46   4.67    40.23   5.51    1.23
Sc      30.49   7.10    29.91   6.49    0.58    33.48   6.28    30.92   8.29    2.56
To      23.74   4.12    22.95   3.62    0.79    25.37   3.81    23.98   4.87    1.39
Gi      18.71   6.18    18.98   5.84   -0.27    19.33   5.41    18.13   7.29    1.20
Cm      25.89   1.68    25.56   2.75    0.33    25.84   2.87    25.33   2.71    0.51
Ac      29.97   4.14    29.00   4.08    0.97    30.49   3.15    28.46   4.00    2.03**
Ai      20.14   3.90    19.67   2.97    0.47    21.75   3.07    21.02   3.96    0.73
Ie      40.17   4.11    39.72   4.53    0.45    41.24   4.43    39.21   5.96    2.03**
Py      12.51   2.77    10.79   2.85    1.72**  11.38   2.71    11.10   3.02    0.28
Fx       9.26   3.44     9.74   3.92   -0.48*   10.08   2.82    10.51   3.47   -0.43
Fe      15.40   3.56    17.51   3.79   -2.11    23.51   3.38    24.07   3.53   -0.56

Note. For males, N = 35 for higher, N = 43 for lower; for females, N = 63 for higher, N = 61 for
lower.
* p < .05.
** p < .01.
sis, The equation derived from the regres- mean of an array of computed scores will 
sion analysis is offered below: converge on 50.00. t 
‘ t i i d to compute 
Teaching: effectiv This equation was then use 
: Airaid scores for each of the 293 students from 
= 14,743 + .334So0 — .670Gi the two programs. Table 2 presents a en 
a mary of the results obtained from an ana | 
i Sa + 909Py — .446Fx. ysis of these scores, All three differentia- 
The beta weights are for use with raw tions are significant, as of course would be 
scores on the five OPI scales, and the con- expected for application of an equation to 
stant of 14.743 has been set so that the the cases on which it was developed. 

TABLE 2
RELATIONSHIP BETWEEN SCORES ON THE CPI EQUATION FOR TEACHING
EFFECTIVENESS AND RATINGS IN THE SAMPLES USED TO DERIVE THE EQUATION

Samples                        N       M       SD     diff     t      r bis
University of California
  High-rated females           46     [...]   [...]
  Low-rated females            45     [...]   3.99    2.00    2.88*    .37
Ball State University
  High-rated females           63     51.93   4.36
  Low-rated females            61     [...]   5.36    2.28    2.50*    .28
  High-rated males             35     [...]   [...]
  Low-rated males              43     49.00   5.05    [...]   2.92*    .40

* p < .01.


CROSS-VALIDATION

The essential step for any equation such as this is
cross-validation. About the time we were completing our
analyses to this point, a paper by Veldman and Kelly (1965)
was published in which CPI data were presented; these two
authors have been kind enough to permit cross-validation of
the equation on their sample.

The Ss studied by Veldman and Kelly were 34 University of
Texas women. The sample was dichotomized by supervisors
into two categories, more effective versus
less effective, and these classifications were
confirmed by interpretation of scores on 
the Peck Incomplete Sentences Test for 
teaching potential (Peck, 1960). 

CPI scores were computed for each girl, 
using the formula given above. For the 17 
higher-rated student teachers, a mean of 
52.41, SD 5.17 was obtained, and for the 
17 lower-rated the mean was 47.94, SD 
6.59. The difference of 4.47 produced a t 
ratio of 4.27 (p < .001), and a biserial 
correlation of +.44. Although the sample 
of 34 is too small to permit broad generalization,
the results tend clearly to confirm
the validity of the equation. 

The analyses reported to this juncture 
all deal with magnitudes and differences. 
An equally important question concerns 
the percentage of error which would occur 
if scores on the equation were used to 
classify individuals as higher- or lower- 
rated. 

The proper cutting point to use in such 
classification is that which most closely 
approximates the split given by the criterion
dichotomies. The score of 52 best
met this requirement. Therefore, students 
with scores of 52 or above were classified 
as “highs” and those with scores of 51 or 
below as “lows.” If a student is called high 
by the test and is also high on the cri- 
terion dichotomy, then he may be termed 
a “hit.” Likewise, if a student is called
low by the test, and is also low on the criterion,
his classification is correct. An error
in either direction may be called a
“miss.”
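
The choice of cutting point described above can be sketched as a search for the score whose "high" group is closest in size to the criterion's high group. The scores below are invented for illustration and the function names are hypothetical; only the rule itself (match the criterion split, then call scores at or above the cut "high") comes from the text.

```python
def best_cutting_point(scores, n_high_criterion):
    """Return the cut score c for which the number of cases scoring >= c
    is closest to the number rated high on the criterion."""
    candidates = sorted(set(scores))
    return min(
        candidates,
        key=lambda c: abs(sum(s >= c for s in scores) - n_high_criterion),
    )

def classify(score, cut):
    """Label a case 'high' at or above the cut, otherwise 'low'."""
    return "high" if score >= cut else "low"

# Hypothetical equation scores for ten students, five of whom were rated high.
scores = [44, 47, 49, 50, 51, 52, 53, 55, 57, 60]
cut = best_cutting_point(scores, n_high_criterion=5)
labels = [classify(s, cut) for s in scores]
```

A "hit" is then any case whose label agrees with its criterion rating, and a "miss" is any case whose label does not.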

The question now becomes, what is the
percentage of “hits,” or the batting average,
if we use the equation to forecast
high versus low on the criterion? For the
females from Santa Barbara, 60 (65.9%)
of the 91 Ss are correctly classified. For the
124 girls from Muncie, 70 (56.5%) were
correctly designated. For Hill’s 78 males,
52 (66.7%) were properly identified. And
for the Veldman-Kelly Ss, the only sample
which is fully independent for purposes of
cross-validation, 23 students (64.7%) were
correctly categorized.

We must also compute the chance level 
for such classification, for unless the cri- 
terion split is precisely 50-50 the chance 
level of accuracy will be greater than 50%;
this occurs because the “best guess,” when 
the frequencies depart from a 50-50 basis, 
is that any individual will belong to the 
larger category. The chance base lines for 
the four samples in the preceding para- 
graph are as follows: Durflinger females, 
50.6%; Hill females, 50.8%; Hill males, 
55.1%; and Veldman-Kelly females, 
50.0%. 
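
The hit percentages and chance base lines for the three derivation samples can be recomputed from the counts given in the text and in the note to Table 1. This is a minimal sketch: the dictionary layout and function names are assumptions made for illustration, and the chance level is taken, as the text describes, to be the accuracy of always guessing the larger criterion category.

```python
# (hits, number rated high, number rated low) for each derivation sample.
samples = {
    "Durflinger females": (60, 46, 45),
    "Hill females": (70, 63, 61),
    "Hill males": (52, 35, 43),
}

def hit_rate(hits, n_high, n_low):
    """Percentage of Ss classified in agreement with the criterion."""
    return 100 * hits / (n_high + n_low)

def chance_level(n_high, n_low):
    """Percentage correct when every S is assigned to the larger category."""
    return 100 * max(n_high, n_low) / (n_high + n_low)

for name, (hits, n_high, n_low) in samples.items():
    print(f"{name}: hits {hit_rate(hits, n_high, n_low):.2f}%, "
          f"chance {chance_level(n_high, n_low):.2f}%")
```

The results agree with the reported percentages to within rounding, and in each sample the hit rate exceeds its chance base line.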


CONCEPTUAL ANALYSIS


Having demonstrated that the CPI 
equation for student teaching effectiveness 
has at least moderate validity, the next 
question to raise is this: “What kind of an 
individual is it, in everyday language and 
description, who is identified as a good 
prospect by this equation?” We wish to 
turn, in other words, from a consideration 
of the predictive validity of the equation 
to a study of its diagnostic implications. 

The methodology to be used in answer- 
ing this question is one which has been 
called “conceptual analysis [Gough, 1965].” 
Its essential feature involves study of in- 
dividuals rated higher or lower by the 
equation, and from identifying their prom- 
inent characteristics to infer the underly- 
ing psychological dimensionality of the 
measure. 

Two research samples were available for 
carrying out this conceptual analysis. The 
first was composed of 101 college males, 
members of three different fraternities, at 
the University of California, Berkeley. 
Each boy had taken the CPI and each had 
also been described, by five of his fellow- 
members, on the Adjective Check List 
(ACL; Gough & Heilbrun, 1965). By sum- 
ming the number of times a word was 
checked about a boy by these five peers, a 
descriptive total was obtained for each of 
the 300 words in the instrument. Within
each of the three subsamples, these 300 to- 
tals were converted to standard scores, so 
that the three subsamples could be com- 
bined into the one large sample of 101 Ss. 
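
The pooling step can be sketched as follows: totals for a given word are converted to standard scores within each subsample before the subsamples are combined, so that differences in level or spread between houses do not enter the pooled correlations. The data are invented, the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev

def standardize_within(groups):
    """Convert raw totals to z scores within each subsample, then pool them."""
    pooled = []
    for totals in groups:
        m, sd = mean(totals), pstdev(totals)
        pooled.extend((x - m) / sd for x in totals)
    return pooled

# Hypothetical totals (0 to 5 checks by five peers) for one adjective
# in three fraternities of unequal size.
fraternities = [[0, 1, 2, 5], [3, 4, 5], [1, 1, 2, 4, 2]]
z = standardize_within(fraternities)
```

After this step each subsample has a mean of zero and unit variance on the word, so the combined list can be correlated with the equation scores as a single sample.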

From the CPI protocols, a student teach- 
ing effectiveness score was also computed 
for each boy, using the equation cited 
above. Then, this CPI score was correlated 
with the 300 peer descriptions gathered by 
means of the ACL. A significant positive
correlation between the CPI score and a 
word on the ACL would mean that this 
word was used in a differential way to de- 
scribe Ss scoring high on the equation; a 
significant negative correlation would 
mean that the word was used in a differ- 
ential way to describe boys with low scores 
on the equation. By accumulating the
words with highest positive and largest 
negative correlations, a verbal portrait of 
the high-scoring and low-scoring boy can 
tentatively be drawn. 

For girls, a similar sample of 92 Ss from 
two sororities at the University of California,
Berkeley, was available.² Here too,
each girl had taken the CPI, and had been 
described on the ACL by five of her peers. 
Within each subsample, the 300 ACL totals 
were converted to standard scores so that 
the correlations between the equation and 
the 300 words could be computed on the 
full sample of 92 Ss.

For the males, more than 35 correlations 
were significant at the .01 level of confi- 
dence (3 would be expected by chance), so 
the pattern of relationships between peer 
descriptions and the CPI index appears to 
be reliable. To clarify the findings and 
render the patterns more easily discerni- 
ble, the 12 words with highest positive cor- 
relations will be listed first, and then the 
12 words with largest negative correlations.
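
The procedure of correlating the equation score with every word and then listing the extremes can be sketched as below. The data are invented (five Ss and four words rather than 101 Ss and 300 words), and the helper names are hypothetical. The chance expectation quoted in the text is simply the number of words multiplied by the significance level.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_words(equation_scores, word_scores, k=2):
    """Return the k most positive and k most negative words, plus all r's."""
    r = {w: pearson_r(equation_scores, totals) for w, totals in word_scores.items()}
    ordered = sorted(r, key=r.get, reverse=True)
    return ordered[:k], ordered[-k:][::-1], r

# Number of correlations expected to reach the .01 level by chance alone.
expected_by_chance = 300 * 0.01

# Hypothetical equation scores and standardized peer totals.
scores = [45.0, 48.0, 50.0, 53.0, 56.0]
words = {
    "conscientious": [-1.2, -0.4, 0.1, 0.6, 0.9],
    "reserved": [-0.5, 0.2, -0.1, 0.3, 0.4],
    "reckless": [1.1, 0.5, 0.0, -0.7, -0.9],
    "quick": [0.3, 0.4, -0.2, 0.1, -0.6],
}
top, bottom, r = rank_words(scores, words)
```

With 300 words, about 3 correlations would be expected to reach the .01 level by chance, which is the figure the observed count of more than 35 is compared against.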

The 12 words used most typically to describe
college males scoring high on the
CPI equation were these (the words are
given in order of magnitude of correlation):

conscientious
practical
rational
moderate
methodical
planful
responsible
logical
reasonable
capable
thorough
reserved

The correlations range from a low of +.29
(for “reserved”) to a high of +.87 (for
“conscientious”).

² Use of samples of fraternity and sorority members
should not bias the findings, as it has been
shown in earlier work (Gough, 1965) that essentially
equivalent diagnostic implications of test
variables are obtained from samples differing
significantly on age, educational level, and occupational
status. Sex differences are important in
conceptual analysis, and for this reason separate
inquiry was undertaken for male and female Ss.

What does this set of 12 descriptions suggest
about the high-scorer? He seems to be
a diligent, effective individual, well-organized,
attentive to the practical demands
of his work, and thorough and conscientious
in carrying out his duties. In temperament
he is self-disciplined and reserved, not at all
flamboyant or unconventional. He is the
kind of person who can be counted on to
display discretion and good judgment in
any situation.


What about the low-scoring male on the
equation? The 12 words used most differentially
to describe him were these:


reckless 
daring 
pleasure-seeking 
spendthrift 
irresponsible 
flirtatious 
show-off 
spontaneous 
adventurous 
mischievous
quick 
careless 


A “syndrome” of temperament and behavior
seems clearly evident in this cluster
of descriptions. The low man on the
equation for forecasting teaching effectiveness
appears to be undercontrolled, unbridled,
too much dominated by his own
impulses. Although in many ways an attractive
personality (spontaneous, adventurous,
quick), and probably original in
his perceptions and ideas, he is too
irresponsible and too careless to perform
effectively in a day-by-day classroom situation.

While pondering these characterological
implications of the CPI equation, we
should note some of the factors which are
not included: The equation does not appear
in any way to rest on intellectual
ability, charm, assertiveness, or other 
qualities which one might hypothesize as 
determiners of scores on the inventory. The 
two patterns (for high-scoring and low- 
scoring males) are not merely good and 
bad, but rather two patterns more or less 
compatible with the demands of the cri- 
terion. 

Attention is now directed to the females. 
Independent analyses are required be- 
cause of the common finding that the same 
variable will have different implications for 
males and females. We should not, there- 
fore, expect to find that the CPI equa- 
tion rests on the same psychodynamic basis 
for both sexes. 

As with the males, more than 35 descrip- 
tions correlated at or beyond the .01 level 
of confidence, and to clarify the relation- 
ships only the 12 words with highest posi- 
tive and 12 with largest negative correla- 
tions will be listed. The 12 words used most 
typically to describe high-scorers were
these: 


dominant 
persevering 
persistent 
serious 
opinionated 
ambitious 
demanding 
logical 
rigid 
clear-thinking 
determined 
responsible 


What was expected to happen did hap- 
pen: The woman identified by the equa- 
tion as a potentially effective student 
teacher is rather different from the man so 
identified. The high-scoring young lady is 
a strong and resourceful individual, clear
and explicit about her goals, and resolute
in pursuing them. In fact, her seriousness
of purpose and determination are such that
those who know her well find her somewhat
rigid and opinionated, however
worthy her ambitions and steadfastness.

The pattern has a touch of the negative 
in it, but one cannot conclude that she
would be domineering or authoritarian in 
class, nor can one assume that pupils would 
find her objectionable. Retracing the chain 
of relationships which led to the de- 
velopment of the equation, we must recall 
that the original criterion consisted of rat- 
ings of effectiveness by supervisors thor- 
oughly grounded in education philosophy 
and sensitive to any hint of autocratic or 
manipulative behavior. 

The effective woman teacher, the ad- 
jectival analyses therefore suggest, may be 
one who, although single-minded and in- 
exorable in her resolve, can nonetheless 
deal with her students in an insightful and 
responsible manner. Her personal friends 
may see a touch of inflexibility and dog- 
matism in her beliefs, but her students may 
experience these qualities as decisiveness 
and clarity. 

Finally, what of the low-scoring college 
woman, how is she described by her 
friends and peers? The 12 adjectives most 
differentially applied were these: 


curious
affectionate
careless
easy going
unconventional
dreamy
understanding
irresponsible
cheerful
natural
individualistic
thoughtful

Some of these words duplicate descriptions
found for the low-scoring males (e.g.,
careless and irresponsible), but the flavor 
of the cluster is different. The low-scoring 
female is somewhat undercontrolled, to be 
sure, but she is nonetheless affectionate, 
thoughtful, and of an optimistic turn of 
mind. Hostility, aggression, rebelliousness
(all qualities which one might hypothesize
as negatively related to teaching
effectiveness) are alien to the pattern
actually delineated. Our low-scoring S
seems very likable, easy to get along with, 
a pleasant and undemanding individual.
But as a teacher she will not do; her lack 
of organization, overresponsiveness to dis- 
tractions of the moment, and indifference 
to practical realities are drawbacks too 
great to be ignored. 

The methodology of conceptual analysis, 
although simple in design and in application,
is sufficiently contrary to “ordinary”
procedure so that some readers may mis- 
understand what is being claimed for the 
interpretations sketched above. The four 
characterizations, it should be stressed, are 
not offered as descriptions of good and poor 
teachers, male and female. Rather, they
seek to define four syndromes which are 
diagnosed by the CPI equation. If a male 
scores high on the equation, then it is
likely that he will resemble the person de- 
scribed in the first portrait; and, we hasten 
to add, it is also likely that he will be an 
effective teacher and that supervisors will 
agree on his effectiveness. 

If a female college student scores high on 
the CPI equation, she may be expected to 
be characterized by the constellation of 
traits and dispositions sketched in the 
third portrait. And, should she participate
in student teaching, we can anticipate 
superior performance and high ratings by 
her supervisors.

The same admonitions hold for the 
portrayals of low-scorers on the equation: 
Low-scoring males will tend to be im- 
provident, impulsive, and adventurous in- 
dividuals—and poor candidates for teach- 
ing training; low-scoring females will tend 
to be impractical and undependable, albeit 
charming, and also poor risks for such 
training. There are many, perhaps innu- 
merable, ways of being a poor teacher; the 
function of the CPI equation is to identify 
two important routes (one for each sex), 
and the purpose of conceptual analysis is 
to specify their personological parameters.


TEACHERS’ RATINGS OF STUDENT PERSONALITY TRAITS
AS THEY RELATE TO IQ AND SOCIAL DESIRABILITY¹

4 groups of 2nd- and 3rd-grade school children varying in their IQ
status and their test anxiety status were rated by their teachers on 
a series of 24 personality and school performance characteristics. 
No relationships were found between the teachers’ ratings of their 
children on these characteristics and the children’s test anxiety
status. Significant relationships were found between the teachers’ ratings
on 14 of the 24 characteristics and the children’s IQ status. It
was further found that there existed a relationship between the ex- 
tent to which a characteristic was judged to be desirable and the ex- 
tent to which the item differentiated between high- and low-IQ chil- 
dren, in the direction of the desirable characteristics being more 
frequently attributed to the high-IQ children. Some evidence was pre- 
sented that supported the position that these evaluations reflected, at 
least in part, the biases of the teacher raters rather than simply actual
behavioral differences of the children.


In most educational systems it is ex- 
pected that a teacher be aware of psycho- 
logical differences that exist among his 
students. In some instances the process of 
evaluation of nonacademic behavioral 
traits has grown to the point where the 
modern elementary school teacher must be 
as attentive to a student’s entire personal- 
ity development as he is to his intellectual 
growth. Thus, a student’s permanent school 
record now often contains teachers’ assessments
which, when dressed in the psychologist’s
jargon, appear under such headings
as “extent of peer group dependence,” “control
of aggression,” “introversion,” “anxiety,”
and so forth. These assessments are
used not only in predicting a student’s per- 
formance within the elementary school 
setting, but as part of his permanent rec- 
ord play an important role even in the 
screening of applicants for high school 
and college awards and fellowships, as 
well as for admission to graduate and pro- 
fessional schools. 

Moreover, it should be recognized that 
the formal categorization of student personality
traits by teachers can exert not
only a controlling influence on how that
particular teacher then perceives, organizes,
and interprets later behavior, but as
part of the student’s record, it may also
establish the frames of reference through
which subsequent teachers view the student.
The question of how well teachers can
perform such an evaluative function is
obviously of first-rate importance.

¹ This study was financed by a grant to Yale
University (Seymour B. Sarason, principal investigator)
from the National Institute of Mental Health.

The present study represents an attempt
to investigate the ability of elementary
school teachers to discriminate
among their students on relevant and important
personality variables. Further, an
attempt was made to investigate the relationship
between teachers’ ratings of
their students on these personality variables
and the students’ IQ test scores.


Method


Subject Selection

All of the second- and third-grade school children
of Hamden, Connecticut, were administered
the Test Anxiety Scale for Children (TASC) and
the Lorge-Thorndike intelligence test as a part of
a longitudinal study being carried out at Yale University
(Sarason, Hill, & Zimbardo, 1960). From
this population, those children scoring in the upper
and lower fifteenth percentile of the anxiety distribution
(N = 320) were selected to form the
high-anxiety (HA) and low-anxiety (LA) groups,
respectively. Within each of these groups subjects
(Ss) were matched on IQ score as closely as possible
in order to obtain subgroups of relatively high
IQ (HIQ) and low IQ (LIQ). The effectiveness of
this matching is evident from the group mean IQ
of 115 for the HA-HIQ group and 116 for the LA-HIQ,
as well as from the means of 99 for the HA-LIQ
and 97 for the LA-LIQ groups. A total of 96
Ss were thus finally chosen, 24 in each of these
four experimental groups.


Teachers’ Ratings 


These 96 students were rated by their classroom 
teachers (N = 54) on 24 personality and school 
performance characteristics. These characteristics 
were presented in pairs of contrasting trait names, 
along with a working definition of each of the 
terms in the pair. Each teacher had to decide first 
which of the two terms most accurately described 
the child, then she had to determine by use of a 
6-point scale the degree to which the child ap- 
proached the extreme description given for that 
trait. Each teacher was given a chance to discuss
this rating task with an experimenter to insure
that she understood what was being asked of her.
After this discussion, each teacher worked on the
task privately on her own time. The complete list
of traits is given in Table 1.


TABLE 1
TEACHERS’ RATING SCALE OF STUDENT CHARACTERISTICS

Trait No.
 1. Anxious: Unanxious
 2. (Dependent: Independent)
 3. Shows or expresses emotions: Hides or suppresses emotions
 4. (Communicates easily: Difficulty communicating)
 5. Aggressive: Submissive
 6. Impulsive: Cautious
 7. Sensitive: Not sensitive
 8. Tense: Relaxed
 9. (Ambitious: Unambitious)
10. (Adapts to changes: Set in ways)
11. (Well-liked: Not well-liked)
12. (Mature psychologically or emotionally:
    Immature psychologically or emotionally)
13. Withdraws: Sociable
14. Daydreams: Does not daydream
15. Active: Inactive
16. Overachievers: Underachievers
17. (Learns slowly (new material): Learns quickly
    (new material))
18. (Retains material: Forgets material)
19. Fears failure: Does not fear failure
20. (Pays attention: Does not pay attention)
21. (Strong conscience: Weak conscience)
22. (Feminine: Masculine)
23. (Pessimistic: Optimistic)
24. (Responsible: Not responsible)

Note. Trait numbers in parentheses have
highest agreement as to the desirability of that
trait, while those not in parentheses have least
agreement, and thus are least clearly positive or
negative.

The obtained ratings were then summed across 
Ss within each group for each of the 24 traits sep- 
arately, and then subjected to simple between-Ss 
analyses of variance with test anxiety level and 
IQ group as main effects.
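
The analysis for a single trait can be sketched as a balanced 2 x 2 between-Ss analysis of variance with anxiety level and IQ group as the two factors. The code below is a minimal illustration under that assumption, not a reconstruction of the computations actually performed; the function name and the ratings are hypothetical, and equal cell sizes are required (24 per cell in the study, 3 per cell in the toy data).

```python
from statistics import mean

def anova_2x2(cells):
    """Balanced 2 x 2 between-Ss ANOVA.

    cells[(a, b)] holds the ratings for anxiety level a (0 = LA, 1 = HA)
    and IQ level b (0 = LIQ, 1 = HIQ); all cells must be the same size.
    Returns F ratios for both main effects and the interaction,
    each with 1 and 4 * (n - 1) degrees of freedom.
    """
    n = len(cells[(0, 0)])
    grand = mean(x for ratings in cells.values() for x in ratings)
    cell_mean = {key: mean(ratings) for key, ratings in cells.items()}
    a_mean = [mean(cells[(a, 0)] + cells[(a, 1)]) for a in (0, 1)]
    b_mean = [mean(cells[(0, b)] + cells[(1, b)]) for b in (0, 1)]

    ss_a = 2 * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = 2 * n * sum((m - grand) ** 2 for m in b_mean)
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_within = sum((x - cell_mean[key]) ** 2
                    for key, ratings in cells.items() for x in ratings)
    ms_within = ss_within / (4 * (n - 1))
    return {"anxiety": ss_a / ms_within,
            "iq": ss_b / ms_within,
            "interaction": ss_ab / ms_within}

# Hypothetical ratings on one trait: IQ groups differ, anxiety groups do not.
ratings = {
    (0, 0): [2, 3, 4], (0, 1): [4, 5, 6],
    (1, 0): [2, 3, 4], (1, 1): [4, 5, 6],
}
f_ratios = anova_2x2(ratings)
```

In this toy case the whole between-cell difference is carried by the IQ factor, which mirrors the pattern the study reports for most traits.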


Desirability of Trait Scoring 


A second group of judges was used to provide 
additional information about the social desirability 
of each of the traits used in the teacher rating 
schedule. These data were necessary to test an 
hypothesis which emerged after preliminary anal- 
ysis of the ratings. Fifteen teachers in the Yale 
Master of Arts in Teaching program (M.A.T.), all
of whom had some teaching experience, were asked 
to judge how desirable it was for an elementary 
school child to exhibit each one of the bipolar 
traits which defined the 24 trait dimensions. 

These judgments were made on 10-point scales 
and were computed in such a way that high scores 
for a given item would indicate agreement among 
the judges as to its desirability, while low scores 
would indicate disagreement or lack of clarity 
about the desirableness of that item. The items 
were presented to the judges using the same format 
as was used in the presentation to the teachers 
when they rated their own students. 

The relationship between how conceptually 
clear the desirability of a trait was and how well 
it discriminated on the teacher ratings between IQ 
groups was established by means of correlation. 
The correlation coefficient obtained was between 
the mean desirability score of a trait as determined 
by the M.A.T. judges, and the difference in the 
total teacher rating score between high- and low- 
IQ groups on that trait. Thus, a high positive cor- 
relation would indicate that the better a trait dif- 
ferentiated between high- and low-IQ Ss, the 
greater the agreement among judges as to its de- 
sirability. 

Results


The teachers did not differentiate in their 
ratings between LA and HA children on 
any of the 24 traits. The nonsignificant dif- 
ferences between anxiety groups can be 
seen from the F values presented in Table 2.
On the other hand, it is obvious from
the rest of the evidence presented in this 
table that the teachers did discriminate on 
the basis of IQ. On 14 of 24 traits, stu- 
dents with high IQ were characterized dif- 
ferently from those with low IQ at beyond 
the .05 level of confidence. Over the combined
teacher ratings the difference between
HIQ and LIQ Ss was highly significant
(F = 20.08, p < .001).


TABLE 2
F VALUES OF ANXIETY AND IQ SOURCES OF VARIANCE FOR EACH TRAIT

Trait    Anxiety       IQ
  1        .84        1.30
  2        .07       16.39**
  3        .00        1.98
  4        .43        [...]
  5       1.00        5.70*
  6       3.25        2.97
  7       1.29        6.34*
  8        .21         .75
  9        .25       10.97**
 10        .98        2.43
 11        .23        4.87*
 12        .49        7.86**
 13       1.83       12.28**
 14        .03        4.74*
 15        .73        8.20**
 16       2.12        5.43*
 17        .07       28.38**
 18        .03       13.84**
 19        .01        1.73
 20        .67        7.20**
 21       2.27        2.79
 22        .08         .45
 23        .37        3.36
 24        .01        7.34**

Note. Trait numbers correspond to traits
listed in Table 1.

Although none of the second-order effects 
was significant, nevertheless, in most in- 
stances the differences in teacher ratings 
between Ss of the two IQ groups were 
greater among HA Ss than among LA 
ones. Thus, LA children tended to be 
rated more similarly, whether their IQ was 
high or low, than were HA children. To 
substantiate this observation, a difference 
score for each trait was generated by sub- 
tracting the scores of the high-IQ Ss from 
the low-IQ Ss within each of the anxiety 
groups. The analysis of these difference 
scores, using a t test with nonindependent
observations, demonstrated that difference 
scores on these 24 traits between IQ subgroups
within the HA group were significantly
greater than those difference scores
of the IQ subgroups within the LA group
(t = 2.45, p < .05).
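
A t test for nonindependent observations of this kind treats each trait as a pair: one IQ-group difference score from the HA group and one from the LA group. The sketch below shows the computation on invented difference scores for six traits (the study used 24); the function name is hypothetical, and whether signed or absolute differences were analyzed is not stated in the text.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """t ratio for paired (nonindependent) observations; df = len(x) - 1."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical IQ-group difference scores on six traits.
ha_diffs = [12, 9, 15, 7, 11, 10]   # within the high-anxiety group
la_diffs = [8, 7, 10, 6, 9, 8]      # within the low-anxiety group
t = paired_t(ha_diffs, la_diffs)
```

A positive t indicates that the difference scores tend to be larger within the HA group than within the LA group, which is the direction reported.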

A further examination of the data revealed
that there were marked differences
in how well various traits differentiated 
high- from low-IQ Ss. It appeared that 
those items that showed the largest differences
in scores between the two IQ groups
were the items that were most clearly de- 
sirable traits for an elementary school 
child to have. Apparently, then, this in- 
consistency in the ability of items to dif- 
ferentiate between high and low Ss was re- 
lated to a tendency on the part of the 
teachers to assign more favorable ratings 
to high-IQ Ss to the extent that it was 
clear to them what a favorable rating 
would be. To investigate this notion a correlation
was computed between IQ group
differences on each trait and the desira- 
bility score of that trait as judged by the 
M.A.T. teachers. 

The traits in parentheses in Table 1 are 
those that demonstrated the highest clarity 
of desirability (i.e., above the median),
while for the others there was a lack of 
agreement among judges as to whether one 
of the bipolar traits was desirable and the 
other one undesirable for a child to exhibit 
in school. The product-moment correlation 
between item discriminability and desira- 
bility was .60 (p < .01). 
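
The significance of this correlation can be checked with the usual conversion of r to a t ratio. The sketch assumes that the correlation was computed over the 24 traits, so that the test has 22 degrees of freedom; the function name is illustrative.

```python
import math

def t_from_r(r, n):
    """t ratio for testing a product-moment correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_from_r(0.60, 24)  # 24 traits, hence 22 df
```

The resulting t of roughly 3.5 exceeds the tabled two-tailed .01 value of about 2.82 for 22 df, which is consistent with the reported p < .01.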


Discussion 


The results of this study can be interpreted
in such a way as to cast
doubt on the validity of ratings by elementary
school teachers of student personality
traits. The present teachers were
unable to distinguish reliably between
students who were extremely different,
by their own self report, in test anxiety,
on traits that have been found in past research
to be related to the variable of test anxiety.
Although other interpretations are
possible, these data are consistent with
the earlier conclusion drawn from the research
of Sarason:

    In none of our studies using teachers’ ratings of
    anxiety in relation to a test performance or a
    child’s self-report criterion is there evidence that
    teachers can recognize the anxious child to a degree
    which would be of practical significance [Sarason,
    Davidson, Lighthall, Waite, & Ruebush, 1960, p.
    265].

It does appear from our data, however, 
that teachers are sensitive to differences in 
IQ level. This sensitivity is reflected in 
their differential rating of children with 
high- and low-IQ scores on a wide variety 
of academic and personality traits. The 
child with a high IQ tends to be per- 
ceived, relative to a child with a low IQ, 
as one who learns quickly, pays attention, 
retains material, overachieves, and is am- 
bitious. Such traits are the obvious cor- 
relates of high IQ for the adequately mo- 
tivated student. However, even on traits 
which bear little correspondence to aca- 
demic performance and intellectual func- 
tioning, teachers discriminate between chil- 
dren of different intelligence levels. The 
bright child tends to be seen by the teach- 
ers as being less dependent and daydreamy 
and more aggressive, while at the same 
time being more sensitive, mature, socia- 
ble, popular, and active. 

The issue which becomes immediately 
apparent is whether these evaluations are 
reflecting actually occurring behavioral 
differences, or whether they are distortions 
of social reality. Without an independent 
criterion analysis of each of these traits, 
there cannot be an unequivocal answer to 
such a problem. However, several lines of 
converging evidence lead us to believe that 
a major source of variance in these ratings
is accounted for by teacher bias in percep- 
tion as a consequence of overly positive 
evaluations of bright children. First, it 
was demonstrated that these teachers dis- 
tinguished most clearly between IQ levels 
on those traits which could be most easily 
categorized as desirable or undesirable for
a child to possess or exhibit in school. Second,
it was learned (after completion of
our data collection) that all teachers in the
sample had access to, and were familiar
with, information pertaining to the results 
of the IQ and achievement testing which 
were routinely conducted in this school
system. Finally, a biasing in terms of a 
halo effect appears to be a tenable expla- 
nation for this data, since teachers evalu- 
ated anxious children who were bright 
differently from anxious children who were 
not, and did so on traits shown by previ- 
ous research not to be characteristic of the 
bright, anxious child. Thus, for example, 
while these teachers characterized the 
bright, anxious child with the desirable 
traits of “independence” and “adaptabil- 
ity,” it has been demonstrated that the 
bright but anxious child is extremely de- 
pendent upon task and instruction factors 
as well as upon the approval of authority 
figures, all of which inhibit spontaneity,
independence, personal expression, and
flexibility in school settings (Sarason et al.,
1960).


It is also interesting to note that teachers 
do not consistently differentiate in their 
ratings of the bright and the nonbright 
children when these students have low lev- 
els of test anxiety. In some way then, the 
bright, anxious child is perceived as spe- 
cial and possessing traits of which teach- 
ers approve. In short, teachers are most 
positive about such children. 

It is likely, however, that this favor- 
able attitude is engendered in large part by 
the child’s dependent need for approval by 
the teacher and by his attempts to secure 
it. By not recognizing that such a child is 
frequently experiencing anxiety in relation 
to school and the resultant evaluation of 
his abilities (the definition implicit in the 
test anxiety construct), teachers are unable
to provide the help necessary to improve
the child’s self-conception, and in
fact may even reinforce these test anxiety 
attitudes. In turn, these attitudes may 
generalize to influence many kinds of be- 
havior (e.g., speech, as shown by Zimbardo, 
Barnard, & Berkowitz, 1963) and become 
the core of an enduring personality syn- 
drome. Such a pattern of attitudes may 
cause many of the bright but anxious stu- 
dents who do get into college to lower 
their levels of aspiration, and become 
satisfied with minimal standards of 
learning which do not require utilization of their 
full intellectual and creative capacity 
(Mandler & Sarason, 1952). 

The results of this study are consistent 
with a recent plea for teacher training in 
the knowledge and use of psychological 
variables, as well as in detection of subtle 
stimulus cues necessary for the understand- 
ing, modification, and control of behavior 
(Sarason, Davidson, & Blatt, 1962). The 
questions raised for future research are the 
extent to which knowledge of a child’s IQ 
or anxiety level influences the overt 
classroom behavior of teachers in their handling 
of their students, and the extent to which this 
recognition is perceived and reacted to by the 
students. 


J. W. Barnard, P. G. Zimbardo, and S. B. Sarason 
COGNITIVE TRANSFER IN VERBAL LEARNING: 
II. TRANSFER EFFECTS AFTER PREFAMILIARIZATION WITH 
INTEGRATED VERSUS PARTIALLY INTEGRATED 
VERBAL-PERCEPTUAL STRUCTURES 


JAMES H. REYNOLDS* 
Colgate University 


A previous experiment (Reynolds, 1966) showed that prefamiliariza- 
tion of verbal items embedded in an integrated and meaningful map 
structure produced positive transfer to the learning of sentences re- 
lated to the map configuration. The present study extended the initial 
findings by showing (a) that prefamiliarization with a single, inte- 
grated map structure produced greater transfer to sentence learning 
than did prefamiliarization with the same map which had been frag- 
mented into separate and discrete pictures; (b) that the positive 
transfer effect can be obtained after a 10-min. rest interval between 
the familiarization and the learning tasks; and (c) that the effect can 
be obtained with differing materials, testing methods, and age groups. 
The results were interpreted as evidence that the transfer observed 
was due to the wholeness or completeness of the map structure, and 
that both the first and the present experiment demonstrate a stable 
cognitive mechanism which facilitates rote learning of sentences. 


Previous research (Reynolds, 1966) has 
shown that prefamiliarization of verbal 
stimuli embedded in an integrated and 
meaningful pictorial map produced positive 
transfer to the learning of simple sen- 
tences which contained factual material 
related to the previously-studied map. The 
transfer obtained was significantly greater 
than that obtained for any of five control 
conditions which received prefamiliariza- 
tion with the verbal stimuli, the pictorial 
map, or nonintegrated combinations of 
each. It was concluded that the integration 
of the verbal and perceptual stimuli into a 
single meaningful structure, rather than 
familiarization with these components sep- 
arately, was responsible for the positive 
transfer observed. The theoretical explana- 
tion of this result was that the integration 
of verbal and pictorial stimuli into a single 
meaningful whole permitted the formation 
of an assumed mental organization akin to 
Tolman’s (1948) concept of a “cognitive 
map,” which persisted in memory and 
aided later rote learning of the related 
sentences. 

The present experiment attempted to 
explore further the effects of a 
verbal-perceptual structure upon transfer to a rote 
verbal task. The main problem investi- 
gated was whether the positive transfer 
obtained in the previous experiment was 
due to the wholeness or completeness of the 
verbal-perceptual stimulus configuration, 
or to the pairing of two types of stimuli— 
letters and pictures. A cognitive interpre- 
tation would assume that the wholeness of 
the perceptual configuration (i.e., the 
integrated map) provided a structure in 
which each part was meaningfully related, 
and thus could be recalled and utilized at 
the time of the learning task. Alterna- 
tively, it is possible that the pictorial 
characteristics of the map simply pro- 
vided discrete perceptual stimuli which, 
added to the discrete verbal stimuli, pro- 
vided stronger but still not necessarily 
integrated stimulus learning at the time 
of prefamiliarization. The latter alterna- 
tive permits an S-R interpretation of the 
previous results by stating that the indi- 
vidual verbal stimuli were learned better 
when associated with discrete picture com- 
ponents during prefamiliarization than 
when presented alone, and thus the 
presence of discrete pictorial stimuli—and 
not the wholeness of the total map struc- 
ture—was responsible for the transfer ob- 
served. 

To evaluate these alternatives, the 
present experiment compared groups which 
received prefamiliarization with a single, 
integrated map structure with other groups 
receiving the same verbal-pictorial stimu- 
lus combinations but fragmented so that 
the subject (S) was exposed to a series of 
separate word-picture stimuli rather than 
an integrated and meaningful whole. Con- 
trols received only verbal stimuli during 
prefamiliarization. The main hypothesis, 
in accordance with a cognitive theory, was 
that the integrated-map groups would 
demonstrate greater transfer to a related 
sentence-learning task than would the 
groups familiarized with the discrete 
word-picture materials. 

A second problem investigated dealt with 
retention of prefamiliarized cognitive ma- 
terial. In the initial experiment, the trans- 
fer learning task was presented immedi- 
ately following the prefamiliarization 
period. It is possible that, although the 
cognitive-map treatment yielded superior 
transfer in this immediate-memory situa- 
tion, its positive effect may have been 
due to short-term memory of the preced- 
ing stimulus configuration rather than to 
the presence of a stable cognitive struc- 
ture built up during the prefamiliarization 
period. Were this the case, the effect might 
be expected to dissipate over a short rest 
interval, in a manner similar to that dem- 
onstrated by Peterson and Peterson 
(1959) and others using different types of 
stimulus material. To test this possibility, 
in the present study a 10-minute rest 
interval was inserted between the end of 
the prefamiliarization period and the 
beginning of the transfer learning task. 

Finally, an attempt was made to test 
the generality of the previous findings by 
using Ss from a different age group, em- 
ploying two differing pictorial-map con- 
figurations, and administering the learning 
tasks at two different presentation rates 
by a study-test, rather than a free-recall 
method. 


Method 
Subjects 


The Ss were 36 boys and 36 girls between the 
ages of 15 and 18, all enrolled in a summer 
study program conducted by Colgate University 
for high school students of above-average ability. 
All Ss volunteered, and were paid a nominal sum 
for participation. For each sex, Ss were assigned to 
one of the six experimental conditions according 
to a predetermined order as they appeared 
individually for the experiment. The assignment 
method provided equal numbers of boys and girls 
in each condition. 


Design 

The general design was similar to that described 
in detail in Reynolds (1966). In the first stage of 
the experiment, map sketches of varying degrees 
of meaningful structure were presented for 
learning to Ss in the different groups. Following this 
first, or prefamiliarization, stage all Ss received a 
10-minute rest interval during which they worked 
on a puzzle which was unrelated to the 
experimental tasks. In Stage 2, all Ss received a rote 
sentence-learning task in which the sentences to be 
learned were related to the structure materials 
used in Stage 1. The main hypothesis was that a 
meaningful and integrated map structure presented 
in Stage 1 would provide S with a cognitive 
structure which would be retained over the 10-minute 
rest interval and would transfer positively to the 
Stage 2 learning task, whereas Ss given Stage 1 
maps which were not integrated into a single 
meaningful context would fail to form and 
maintain a cognitive structure and thus would 
demonstrate less positive transfer to the Stage 2 task. 


Materials and Procedure 


Stage 1. Three types of materials, designating 
three levels of cognitive structure, were used in 
Stage 1. The Cognitive Group received for study 
an 8½ × 11-inch map sketch depicting a common 
scene which contained eight parts. Each of the 
eight parts was labeled with a 
consonant-vowel-consonant (CVC) of 85-100% association value 
(Glaze, 1928). At a second level, the Picture Group 
received for study an 8½ × 11-inch sheet on which 
were depicted the same eight labeled parts making 
up the map presented to the Cognitive Group; 
these parts were separated from each other by 
borders, so that the sheet contained eight separate 
pictures with CVC labels rather than a single 
integrated scene containing eight meaningfully 
related parts. At a third level, the Label Group 
received for study in Stage 1 an 8½ × 11-inch sheet 
on which were printed the eight CVC labels, but 
which contained no pictures. 

At each of these three levels of structure, two 
types of task stimuli were employed. Half of the 
Ss in the Cognitive Group were presented with 
the same map used in the previous experiment, 
shown in Figure 1a. This map, designated the 
crossroads (CR) map, showed a modern highway 
intersection and its surroundings, including eight 
labeled parts: an airstrip, gas station, shopping 
center, diner, farm, train, trailer-truck, and police 
car. The other half of the Ss in the Cognitive 
Group received for study a map showing an old 
town (OT) and its surroundings. This map, shown 
in Figure 1b, contained CVC labels embedded at 
eight points in the total scene: in the town, on a 
flat plain above town, in the mountains, in the 
forest, on a lake, by a farm, by a bridge, and by a 
large castle-like building. As Figures 1c and 1d 
illustrate, the Picture Groups for the CR and OT 
conditions received fragmented variations of the 
CR and OT maps, these variations showing each 
labeled part in the same relative position as it 
appears on the integrated map, but separated from 
other parts by a border and blank space. The CR 
and OT Label Groups received white sheets of 
paper on which were printed CVCs in the same 
relative positions as shown in Figures 1a and 1b, 
respectively. 

Fig. 1. Stimulus configurations presented to the Cognitive and Picture Groups during 
Stage 1 of the experiment. 

The S was told he would be shown a sheet 
containing eight three-letter initials in specific 
positions, and the task was to learn the initials and 
their positions. After a 30-second study period, the 
sheet was removed and S was given a test page 
which was identical to the map, picture, or label 
sheet just studied except that the initials were 
omitted. The S had one minute in which to recall 
and write on the test page all of the initials, in their 
appropriate positions. After this test, a second 30- 
second study period and 1-minute test period were 
given, and so on until S was able to write all of 
the initials correctly, in their correct positions, on 
a single test. Thus in Stage 1 all Ss in all conditions 
learned the initials and in addition Ss in the Pic- 
ture and Cognitive Groups had opportunity to as- 
sociate these initials, respectively, with separated 
pictures or with parts of an integrated and poten- 
tially meaningful whole. 

Forgetting interval. Following a perfect test 
trial in Stage 1, all Ss were given a 10-minute 
forgetting interval which was filled by attempting to 
solve a letter puzzle. The puzzle consisted of a 12 
X 12-cell matrix, with the letters A-L inserted in 
the 12 cells of the left column. The S was in- 
structed to try to fill in the rest of the matrix cells 
with the letters A-L in such a manner that (a) 
no letter appeared twice in the same row, (b) no 
letter appeared twice in the same column, and (c) 
no letter appeared immediately before or after an- 
other letter more than once in the entire matrix. 
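
The three rules of the filler puzzle can be written as a short validity check. The sketch below is illustrative only and is not part of the original procedure. The function names `violations` and `row_complete_square` are hypothetical, and rule (c) is read here as applying to ordered pairs of horizontally adjacent letters, an assumption, since the report does not say how adjacency was defined.

```python
from collections import Counter
from string import ascii_uppercase

LETTERS = ascii_uppercase[:12]  # the letters A-L used in the puzzle


def violations(grid):
    """List rule violations for a square grid of letters.

    Assumed reading of the rules: (a) no letter twice in a row,
    (b) no letter twice in a column, (c) no ordered pair of
    horizontally adjacent letters more than once in the grid.
    """
    problems = []
    for r, row in enumerate(grid):
        if len(set(row)) != len(row):
            problems.append(f"row {r} repeats a letter")
    for c, col in enumerate(zip(*grid)):
        if len(set(col)) != len(col):
            problems.append(f"column {c} repeats a letter")
    pairs = Counter((a, b) for row in grid for a, b in zip(row, row[1:]))
    for (a, b), count in pairs.items():
        if count > 1:
            problems.append(f"pair {a}{b} occurs {count} times")
    return problems


def row_complete_square(n):
    """Row-complete Latin square for even n (zigzag first row, then shifts)."""
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    return [[LETTERS[(v + shift) % n] for v in first] for shift in range(n)]


# A plain cyclic square obeys rules (a) and (b) but breaks rule (c).
cyclic = [LETTERS[i:] + LETTERS[:i] for i in range(12)]
solution = row_complete_square(12)

print("cyclic square violations:", len(violations(cyclic)))
print("row-complete square violations:", len(violations(solution)))
```

Under the ordered-pair reading the second grid satisfies all three rules with A-L down the left column. If rule (c) is read as covering unordered pairs instead, the grid has more adjacent positions than distinct pairs, so no complete solution would exist.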

Stage 2. Following the forgetting interval S was 
seated before a Stowe memory drum set in a black 
and was given instructions for the 
Stage 2 learning task. The learning task consisted 
of eight simple sentences, each beginning with a 
CVC which had been learned in Stage 1 and 
ending with a word indicating an occupation. The 
sentences for the CR conditions were: KOR is a pilot, 
NUB is a gas station attendant, BAR is a shopkeeper, 
PUM is a truckdriver, POS is a policeman, RAZ is a 
cook, FER is a brakeman, and TUK is a farmer. For 
the OT conditions, the sentences were: KOR is a 
shepherd, PUM is a gold miner, POS is a lumberjack, 
RAZ is a tollkeeper, WEP is a tailor, FER is a wealthy 
duke, BAR is a farmer, TUK is a fisherman. 

The sentences were presented for 10 trials by 
the study-recall method. On each trial, the 
sentences were first presented one at a time in the 
[...]ing the study [...] recall [...] 
say aloud the occupation of each initial as it [...] 
time upon learning under the various cognitive 
conditions, half of the Ss in each condition [...] 
condition, Ss of both sexes were distributed 
over these two levels of presentation time. 
Results 


Since both sexes and a range of age 
levels were represented in each of the 
experimental treatments, preliminary analyses 
were made to determine possible effects 
of these S variables upon the data. Mean 
ages for each condition ranged between 
15.75 and 16.16 years, a difference which 
did not approach statistical significance. 


To evaluate possible sex differences, the 
mean number of correct responses over the 
10 trials of Stage 2 learning was 
calculated for boys and girls in the Cognitive, 
Picture, and Label conditions. These 
means, pooled over presentation times and 
the CR-OT variable, were B = 66.00 and 
G = 64.83 for the Cognitive condition, B 
= 60.25 and G = 58.17 for the Picture 
condition, and B = 61.25 and G = 56.25 
for the Label condition. Although boys 
demonstrated a slight superiority over 
girls in Stage 2 performance, the highest 
t value obtained in comparing the means 
for each condition was 1.03 (p > .10). 
Therefore, the data from boys and girls 
were pooled in the subsequent analyses. 

Stage 1 learning. To determine if the 
various presentation modes had differential 
effects upon Stage 1 learning, the mean 
number of Stage 1 trials taken to learn 
all stimuli was calculated for the 
OT-Cognitive, CR-Cognitive, OT-Picture, 
CR-Picture, OT-Label, and CR-Label groups. 
The means ranged from 2.33 to 2.92 trials, 
and did not differ significantly from each 
other, F < 1.00, df = 5/66. These data 
indicate that the Stage 1 learning task 
was relatively easy, and that neither the 
varying structure treatments nor 
differences in task stimuli (CR vs. OT) 
produced differences in speed of learning the 
initials. 

Stage 2 learning. Table 1 presents means 
and SDs of number of correct responses 
over the 10 trials of Stage 2 learning for 
all groups. A preliminary test for 
homogeneity of variance indicated that differences 
in variance were not significant, F_max = 
17.47, df = 12/5, p > .05. In a 3 × 2 × 2 
analysis of variance, no significant 
difference was found in comparison of the 
OT and CR conditions, F = 1.99, df = 
1/60, p > .05. The difference between the 
2-second and 3-second presentation rates 
was highly significant, as expected, F = 
20.15, df = 1/60, p < .001. Also, as 
hypothesized, a significant difference was 
obtained for the main effect of cognitive 
structure, F = 3.41, df = 2/60, p < .05. 
The F value for the Presentation Time × 
Structure interaction was 2.11, p > .05, 
and F’s of all other interactions were less 
than 1.00. 

TABLE 1 
Means and Standard Deviations of Total Correct Responses over 10 Trials 
of Stage 2 Learning 

                      2 seconds                     3 seconds 
                 OT            CR              OT            CR 
Groups         M     SD      M     SD       M     SD      M     SD 
Cognitive    61.00  3.90   66.50 10.95    68.33  4.89   67.50  8.36 
Picture      51.83 12.48   50.33 16.00    63.17 10.38   71.50  5.96 
Label        49.00 16.30   53.17 14.95    62.33 10.76   68.50  7.06 
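
The F ratios reported for the overall analysis can be approximately recovered from the summary statistics in Table 1. The following is a minimal sketch, not the author's computation. It assumes six Ss per cell (72 Ss spread evenly over 12 cells) and a fixed-effects between-subjects design, so the error mean square is the average of the 12 cell variances. All variable and function names are hypothetical.

```python
from itertools import product

N_PER_CELL = 6  # assumed: 72 Ss spread evenly over the 12 cells

STRUCTURES = ("Cognitive", "Picture", "Label")
RATES = (2, 3)          # presentation rate in seconds
TASKS = ("OT", "CR")

# (mean, SD) per cell, copied from Table 1 in the column order
# 2-sec OT, 2-sec CR, 3-sec OT, 3-sec CR.
ROWS = {
    "Cognitive": [(61.00, 3.90), (66.50, 10.95), (68.33, 4.89), (67.50, 8.36)],
    "Picture": [(51.83, 12.48), (50.33, 16.00), (63.17, 10.38), (71.50, 5.96)],
    "Label": [(49.00, 16.30), (53.17, 14.95), (62.33, 10.76), (68.50, 7.06)],
}
CELLS = {
    (s, r, t): ROWS[s][i]
    for s in STRUCTURES
    for i, (r, t) in enumerate(product(RATES, TASKS))
}

n_total = N_PER_CELL * len(CELLS)
grand = sum(m for m, _ in CELLS.values()) / len(CELLS)
variances = [sd ** 2 for _, sd in CELLS.values()]
ms_error = sum(variances) / len(variances)   # pooled within-cell variance
df_error = len(CELLS) * (N_PER_CELL - 1)


def ss_between(factors):
    """Sum of squares of the marginal means over the given key positions
    (0 = structure, 1 = rate, 2 = task); valid because cell sizes are equal."""
    groups = {}
    for key, (m, _) in CELLS.items():
        groups.setdefault(tuple(key[f] for f in factors), []).append(m)
    n_per_group = n_total / len(groups)
    return n_per_group * sum(
        (sum(v) / len(v) - grand) ** 2 for v in groups.values()
    )


def f_ratio(ss, df):
    return (ss / df) / ms_error


ss_structure, ss_rate, ss_task = ss_between([0]), ss_between([1]), ss_between([2])
ss_rate_x_structure = ss_between([0, 1]) - ss_structure - ss_rate

F = {
    "structure": f_ratio(ss_structure, 2),
    "rate": f_ratio(ss_rate, 1),
    "task": f_ratio(ss_task, 1),
    "rate x structure": f_ratio(ss_rate_x_structure, 2),
}
f_max = max(variances) / min(variances)  # Hartley's F_max

for name, value in F.items():
    print(f"{name:>17}: F = {value:.2f} (error df = {df_error})")
print(f"F_max = {f_max:.2f}")
```

Under these assumptions the recomputed values agree with the reported F ratios to within the rounding of the tabled means and SDs, which supports the reading of the design as 3 × 2 × 2 with 60 error degrees of freedom.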

Inspection of the individual group means 
in Table 1 shows clearly that performance 
of the Cognitive condition was superior 
to the performance means of the Picture 
and Label conditions at the 2-second pres- 
entation rate. At the 3-second presenta- 
tion rate, however, the superiority of the 
Cognitive Group is less evident, and in 
fact is reversed among those groups receiv- 
ing the CR task. The performance of the 
Cognitive groups was consistently high 
regardless of the rate at which the Stage 
2 task was presented, whereas the Picture 
and Label groups learning the Stage 2 task 
at the 3-second rate were distinctly supe- 
rior to their counterparts who learned at 
the 2-second rate. The consistently high 
performance of all groups at the 3-second 
rate suggests that the Stage 2 learning 
task was relatively easy and was learned 
to a near-ceiling level at the slower rate 
regardless of Stage 1 treatment, thus mask- 
ing differences which showed up clearly 
when the task was made more difficult by 
decreasing the total learning time. 

In view of these subtle differential 
effects of time upon performance, further 
statistical analyses were made to compare 
the effects of structure separately at each 
presentation rate. Since no significant 
difference between the CR and OT tasks was 
found in the first analysis, these tasks 
were pooled for the three structure 
conditions at each rate. Variation due to the 
simple main effect of structure at each 
level of rate was then evaluated, using 
the error term from the original analysis 
of variance according to a procedure 
described by Winer (1962, pp. 256-257). The 
results indicated a highly significant 
difference among the Cognitive, Picture, and 
Label conditions at the 2-second level of 
presentation, F = 5.35, df = 2/60, p < 
.01, but no significant difference among 
the structure conditions at the 3-second 
level, F < 1.00, df = 2/60. Subsequent 
individual comparisons among the 
structure conditions at the 2-second rate, using 
the Newman-Keuls procedure (Winer, 
1962, p. 238), demonstrated that the mean 
for the Cognitive condition was 
significantly higher than both the Picture and 
Label group means, p < .05, but that the 
latter two means did not differ reliably. 
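
The simple main effects of structure at each presentation rate can likewise be sketched from Table 1. This is an illustrative reconstruction under stated assumptions: six Ss per cell, OT and CR cells pooled within each rate, and the overall error term taken as the average of the 12 cell variances. The names used are hypothetical.

```python
N_PER_CELL = 6  # assumed: 72 Ss spread evenly over the 12 cells

# Table 1 means for the (OT, CR) tasks at each (structure, rate in seconds).
MEANS = {
    ("Cognitive", 2): (61.00, 66.50), ("Cognitive", 3): (68.33, 67.50),
    ("Picture", 2): (51.83, 50.33), ("Picture", 3): (63.17, 71.50),
    ("Label", 2): (49.00, 53.17), ("Label", 3): (62.33, 68.50),
}
# Table 1 standard deviations for all 12 cells (order is irrelevant here).
SDS = [3.90, 10.95, 4.89, 8.36, 12.48, 16.00,
       10.38, 5.96, 16.30, 14.95, 10.76, 7.06]

# Error term from the overall analysis: pooled within-cell variance.
ms_error = sum(sd ** 2 for sd in SDS) / len(SDS)


def simple_effect_f(rate):
    """F for the structure effect at one rate, tested on the overall error."""
    pooled = [sum(pair) / 2 for (_, r), pair in MEANS.items() if r == rate]
    centre = sum(pooled) / len(pooled)
    n_per_mean = 2 * N_PER_CELL  # OT and CR cells pooled
    ss = n_per_mean * sum((m - centre) ** 2 for m in pooled)
    return (ss / (len(pooled) - 1)) / ms_error


for rate in (2, 3):
    print(f"{rate}-second rate: F(2, 60) = {simple_effect_f(rate):.2f}")
```

The recomputed values match the reported pattern: an F of about 5.35 at the 2-second rate and an F well below 1.00 at the 3-second rate.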


Discussion 


The performance of the Cognitive con- 
dition on the Stage 2 task was superior to 
both the Picture and Label conditions 
under all treatments except CR-3-seconds, 
and the differences obtained were statisti- 
cally significant both in an overall test 
and at the 2-second level of presentation 
rate. These results, obtained with varying 
materials and a 10-minute rest interval 
between the Stage 1 and Stage 2 tasks, 
confirm and extend those of the previous 
study (Reynolds, 1966), suggesting that 
the assumed cognitive structure imposed 
by prefamiliarization with a meaningful 
structure has relatively stable and general 
positive effects upon later learning. The 
finding of a general superiority of the 
Cognitive treatment over the Picture treat- 
ment also confirms the main hypothesis 
that it is the wholeness or integration of 
the total verbal-perceptual structure, and 
not simply the presence of associated 
verbal and pictorial material regardless of 
structure, which facilitates transfer to a 
related learning task. 

The failure to obtain a statistically 
significant superiority of the Cognitive 
treatment at the 3-second rate of 
presentation requires further explanation, 
since it appears to detract from the clarity 
of the results. Table 1 shows that at the 
3-second rate, all groups achieved means 
over 62, and four of the means were over 
67. Since the highest possible score on the 
learning task was 80 correct responses 
over the 10 trials, it seems reasonable to 
consider that all of the groups receiving 
the slower rate were performing at or near 
the ceiling level for the task, and 
consequently significant differences among 
treatments were not observable at this 
rate. Alternatively, the 2-second rate made 
the learning task more difficult, allowing 
differences among the structure 
treatments to become evident. The data 
indicate that under this more difficult learning 
condition the performances of the Picture 
and Label groups drop considerably, while 
those of the Cognitive groups are 
maintained at near-ceiling level. 
Thus it appears that the failure to 
obtain significant results at the 3-second rate 
was due to the ease with which materials 
were learned at that rate, and does not 
constitute a contradiction of the original 
hypothesis that prefamiliarization with an 
integrated and meaningful structure 
establishes a cognitive representation which 
will transfer positively to a rote learning 
task. Even so, further research using 
different Stage 1 conditions and more 
difficult Stage 2 learning tasks is desirable 
before a full specification of the effects of 
cognitive structure upon rote learning can 
be made. 


REFERENCES 


Glaze, J. The association value of nonsense 
syllables. Journal of Genetic Psychology, 1928, 35, 
255-269. 

Peterson, L. R., & Peterson, M. J. Short-term 
retention of individual verbal items. Journal of 
Experimental Psychology, 1959, 58, 193-198. 

Reynolds, J. H. Cognitive transfer in verbal 
learning. Journal of Educational Psychology, 1966. 

Tolman, E. C. Cognitive maps in rats and men. 
Psychological Review, 1948, 55, 189-208. 

Winer, B. J. Statistical principles in experimental 
design. New York: McGraw-Hill, 1962. 


(Received April 14, 1967) 


Education is a happening 
and here’s what’s happening in education 


EDUCATIONAL PSYCHOLOGY 

A Cognitive View 

By Davin P. AususeL, The Ontario Institute for Studies in Education and the University of Toronto 
The basic premise of this book is that educational psychology is primarily concerned with the nature, 


conditions, outcomes, and evaluation of classroom learning. 
May 1968 640 pp. $8.95 (tent.) 


CHILDREN 
Readings in Behavior and Development 


Edited by Ettis D. Evans, University of Washington 
Designed to accompany Children: Behavior and Development, Second Edition by Boyd R. McCandless 
(Holt, Rinehart and Winston, 1967), this readings book is keyed to the chapter-by-chapter organiza- 
tion of that text, but also contains many readings which take up topics not mentioned in the text. In 
selecting the readings, the editor has emphasized research reviews, analytical papers, and research 
papers carrying implications for childrearing and childhood education. 
April 1968 576 pp. $5.95 paper 


LEARNING, LANGUAGE, AND COGNITION 
Theory, Research, and Method for the Study of Human 
Behavior and Its Development 


By ArTHurR W. Staats, University of Hawaii 


In this, the most comprehensive account available of learning and its relationship to the analysis of 
language, Professor Staats demonstrates that the principles and methods of experimental psychology— 
especially those of the psychology of learning—are also the building blocks from which to construct 


a general concept of human behavior. 

January 1968 640 pp. $9.50 

THE PSYCHOLOGY OF HUMAN GROWTH 

AND DEVELOPMENT 

Second Edition 

By Warren R. BALLER, United States International University, 

and Don C. Cares, Iowa State University 
U d rewritten to meet the needs of today’s students, the second edition of this popular book 
ee ane chapter on thinking which covers the work of Jean Piaget and his Geneva school, and 


summarizes the principal contributions of American and British researchers. 
480 pp. $8.95 


Holl, Rinehart and Winston, in. 
'383-Madison Avenue, New York, New York 10017 


March 1968 


[Fp] Important Books for Educators 


An original paperbound series . . . 


CURRENT ISSUES AND RESEARCH IN EDUCATION 
General Editor: Harry L. Miller, Hunter College 


THE PSYCHOLOGY OF EDUCATION 
Current Issues and Research 
Edited by Donald H. Clark, Hunter College 


Contains 52 selections concerning the culturally deprived, intelligence testing, 
creativity, emotional growth, ability grouping, the psychology of reading, 
national education assessment, and the classroom climate. 

1967 288 pages paper, $2.95 


ELEMENTARY EDUCATION 

Current Issues and Research 

Edited by Maurie Hillson, Rutgers University 

Depicts the changing character of elementary education and includes discus- 
sions of emerging trends in foreign-language teaching, basic reading materials, 
mathematics, technology and computerized instruction. 

1967 330 pages paper, $2.95 
EDUCATION FOR THE DISADVANTAGED 

Current Issues and Research 

Edited by Harry L. Miller, Hunter College 

Forty-nine selections on the world of the socially disadvantaged child stress 
urban and “inner city” problems in the schools aad the increasing use of the 
behavioral sciences in dealing with these problems. 

1967 288 pages paper, $2.95 
SOCIAL FOUNDATIONS OF EDUCATION 

Current Issues and Research 


313 pages paper, $2.95 
SCHOOL CHILDREN IN THE URBAN SLUM 
Readings in Social Science Research 
Edited by Joan I. Roberts, Hunter College 


The articles and excerpts present significant findings from anthropology, psy- 
chology, and sociology concerning problems in urban schools. Subjects given 
special attention include ethnicity. face, and socioeconomic factors in terms 0 
how they affect intellectual Potential, learning capacity, motivation, self-con- 


cept, and personality. 
1967 * 640 pages $7.50 


| 


from The Free Press 


Fp|— 


THE COUNSELING OF COLLEGE STUDENTS 
Function, Practice, and Technique 

Edited by Max Siegel, Brooklyn College 

Foreword by Harry D. Gideonse 

The Counseling of College Students approaches college counseling from a practical 
point of view, concerning itself with the everyday realities of the counselor's 
world. With original contributions by 18 authorities in the field, this volume 
covers every significant aspect of student counseling. 

January, 1968 487 pages $9.95 


POLICY ISSUES IN URBAN EDUCATION 

Edited by Marjorie B. Smiley and Harry L. Miller, both of Hunter 
College 

The editors have chosen 28 readings for their relevance to policy formation on 
such issues as what should be taught in the “inner city” school, what changes must 
be made in teaching, and the need to remedy racial imbalance in urban schools. 
Among the authors represented are Frank Riessman, James S. Coleman, Irwin 
Katz, Whitney M. Young, Jr., and Kyle Haselden. 

April, 1968 512 pages paper, $3.50 


EDUCATION IN THE METROPOLIS 

Edited by Harry L. Miller and Marjorie B. Smiley, both of Hunter 
College 

Depicts the background of the disadvantaged urban child, confronting the 


reader with the human conditions and social implications necessary to an under- 
standing of urban school problems. In addition, the book contains twelve pages 


of photographs which illustrate aspects of the environment described. 
1967 Tico 303 pages paper, $2.95 


TEACHING THE TROUBLED CHILD 
By George T. Donahue and Sol Nichtern , 
“An interesting and well-written book in which [the authors] explain how 
‘teacher-moms’ can help seriously disturbed school-age children... a C 
lenging and provocative approach.”"—Childhood Education. A Free Press Paper- 
back. 


222 pages $2.45 
Rime Ny rH eRe Be ca A ack MeO Oh AN ols 2 el I 
THE FREE PRESS 


A Division of The Macmillan Company 
866 Third Avenue, New York, N. Y. 10022 


WHAT IS DIFFERENT? 


JOHN P. DE CECCO’S 


LEARNING AND INSTRUCTION: 


Dr. DeCecco of San Francisco State 
College, has carefully designed this 
book, with its systematic interrelation- 
ship of topics around a model of teach- 
ing, to be-a maximally efficient and 
ereaie teaching instrument in its own 
right. 

To make the study of the book profit- 
able to students, the author Provides 
lists of objectives at the beginning of 
each chapter, chapter section reminders 
of the points students should review, 
and questions within the text with fully 
explained answers. 

Following closely Robert Gagné’s clas- 
sification of learning conditions, the 
author views the teacher as an inter- 
vener and arranger of appropriate 
learning conditions, The book gives 


THE PSYCHOLOGY OF 
EDUCATIONAL PSYCHOLOGY 


re than traditional coverage to— 
theories and models of teaching, ine 
linguistic and cognitive developed: 
of the average and Uisedven 
child, the motivational functions of bp 
teacher, the psychology of school lt 
ing as the basis for planning atest 
linguistics and psycholinguistics, fi 
the current educational Jonovations. 1} 
educational technology and the Lita 
science and mathematics, the const 
tion and use of classroom and stan 


‘ardized test scores, and the evaluation 


of reports on educational research. 
May 1968, approx. 800 pp., $9.75 


i ite: 903 
For approval copies, write: Box 
Prentice-Hall, Englewood Cliffs, N. J. 
07632 


THAT'S WHAT. 


WHY A SIXTH EDITION? 


ARTHUR T. JERSILD’S 
CHILD PSYCHOLOGY 


As a leader in its field for over thirty 
years, this book has been hailed in the 
United States and in translation in many 
European, South American, Middle and 
Far Eastern countries. 

But now Arthur T. Jersild of Columbia 
University, has added entirely new 
chapters or sections on genetic factors, 
prenatal development, the influence of 
stimulation and deprivation in infancy, 
the origins of intelligence, past and re- 
cent, and research dealing with cogni- 
tive development, particularly inspired 
by Piaget. 


For approval copies, write: Box 903 
Prentice-Hall, Englewood Cliffs, N. J. 


07632 


PRI 


Other chapters such as those on the 
self, the study of dreams, anxiety, lan- 
guage development and the study of 
perception are substantially rewritten 
or revised. The book remains sensitive 
and readable. There is a continued em- 
phasis on the self as essential in under- 
standing all aspects of developmental 
psychology. 

And, there is a distinctly human touch, 
February 1968, 640 pp., $8.50 


THAT'S WHY. 


TICE-HAL 


THE PSYCHOLOGICAL FOUNDATIONS OF EDUCATION SERIES 
General Editor: Victor H. Noll, Michigan State University 


ive investigation in the field of educational psychology has resulted in the eq. 
neeehaey of a wide variety of significant topics that warrant independent consid: 
eration. This paperbound series is designed to provide a deeper examination of spi . 
areas which are not given thorough treatment in many comprehensive textbooks, 
The volumes in the Noll series offer a convenient source of up-to-date and expertly 
written supplementary material. Teachers and students will benefit from the advanoed 
coverage of pertinent subject matter. Used in conjunction with a comprehensive text, 
these books will increase the impact and effectiveness of current courses in the psycho. 
logical foundations sequence. Theoretical material is an important part of each book, 
but the emphasis is centered on practical applications. 


Psychology of Adolescence for Teachers
By Glenn Myers Blair and R. Stewart Jones, both of the University of Illinois
1964, 128 pages, $1.50 


Psychology of the Child in the Classroom 
By Don C. Charles, Iowa State
1964, 86 pages, $1.25 


The Psychology of Learning in the Classroom 
By Robert C. Craig, Michigan State University 
1966, 85 pages, $1.25


The Mentally Retarded Child in the Classroom 
By Marion J. Erickson, Ypsilanti Public Schools 
1965, 114 pages, $1.25 


The Psychology of Discipline in the Classroom 
By William J. Gnagy, Illinois State University 
1 80 pages, $1.50 


Problem Solving in the Classroom 

By Bryce B. Hudgins, Washington University
1966, 74 pages, $1.25 

Teacher Self-Evaluation 

By Ray H. Simpson, University of Illinois
1966, 100 pages, $1.25 

Guidance in the Classroom 


By Ruth Strang, University of Arizona and Morris, Asst. Supervisor of Guid-
ance and Curriculum, Lewis County, N. Y.


1964, 118 pages, $1.50 


Gifted Children in the Classroom 
By E. Paul Torrance, University of Minnesota
1965, 102 pages, $1.25 


Write to the Faculty Service Desk for examination copies:
THE MACMILLAN COMPANY 866 Third Avenue, New York, New York 10022
In Canada, write to Collier Macmillan, 55 York Street, Toronto 1, Ontario


Five from Macmillan 


General Psychology 
By David C. Edwards, Iowa State University


“Long overdue . . . a one term book such as this cannot miss. I find the book to be very
well written, direct, and to the point.”—James Bruning, Ohio University 
This is a brief yet thorough introduction to psychology—its content, its terminology, 
and its methods. The author has accomplished brevity by carefully selecting the im- 
portant topics, and by avoiding unnecessary or irrelevant exposition and discussion of 
the obvious. The approach is modern, and is concerned with psychology as a science 
of behavior. The text is suitable for either a brief or a standard course in general psy- 
chology in which supplementary materials are used. 

1968, 384 pages, $5.95 


Statistics in Education and Psychology: A First Course 

By Merle W. Tate, Lehigh University 
Basic instruction is here provided in most of the techniques used currently in meas- 
urement and research, with emphasis on critical analysis, proper use of statistics, and 
the intelligent interpretation of results. Most of the topics are presented through 
examples which are completely developed and interpreted through the use of actual 
data. Sampling notions are introduced early and treated more rigorously in final
chapters.
1965, 352 pages, $5.95 


Statistics for the Classroom Teacher: 

A Self-Teaching Unit 

By Edward Arthur Townsend, Westfield State College, and Paul J. Burke, The City College of The
City University of New York
This supplementary text provides a review of the basic fundamentals of statistical 
concepts and procedures. The text explains the basic procedures for collecting and 
organizing data of the type most frequently encountered by the classroom teacher. 
A simple explanation of each statistical concept and process is followed by exercises
for self-rehearsal and self-teaching, thereby saving valuable class time.
1963, 80 pages, paper, $1.95 


The Essentials of Learning, 

An Overview for Students of Education 

Second Edition 

By Robert M. W. Travers, Western Michigan University 
Thoroughly updated, this edition provides unusually broad coverage of the field of
learning and presents individual topics within the field in specific detail. Thus, not
one theory of learning is presented, but many, each rooted in research, giving
the student significant and current data on every aspect of learning.

An Introduction to Educational Research 

Second Edition 

By Robert M. W. Travers 

In this revised edition of a classic text, Professor Travers provides an in-
troduction to the principles and methods of educational research. The role of theory is
emphasized throughout and distinguishes this text from others in the field.


1964, 608 pages, $7.50 
Write to the Faculty Service Desk for examination copies 


THE MACMILLAN COMPANY 866 Third Avenue, New York, New York 10022
In Canada, write to Collier Macmillan, 55 York Street, Toronto 1, Ontario 


Educational Psychology 


SECOND EDITION 

by LEE J. CRONBACH, Stanford University 

A leading textbook for educational psychology courses, marked by pro-
nounced emphasis on intellectual learning and development and a full
presentation of the psychological theory and research that underlie
educational practice. Dr. Cronbach focuses on the necessity for under-
standing the pupil as a person, showing how the most useful learning
theory is one that stresses the learner’s purposes and interpretations.
Accompanied by a Student Guide and two separate Test Item Files.
706 pages. $8.95


A Social Psychological 
View of Education 


by CARL W. BACKMAN and PAUL F. SECORD, 
University of Nevada 


A clear and thoroughly documented review of the concepts and data
from social psychology which are directly pertinent to the role of the
teacher in the classroom. The authors consider the teacher, the student,
and the school as part of a social system, emphasizing how important
social experience is to the educational process.

Paperbound. 125 pages. $2.25 (probable) 

Publication: May 1968 


Learning 


by J. CHARLES JONES, Bucknell University 


This concise interpretation of recent psychological research findings 
on learning provides the teacher-in-training with a theoretical basis 
for examining procedures and problems of education. 

Paperbound. 179 pages. $2.25 


Measuring Pupil 
Achievement and Aptitude 


by C. M. LINDVALL, University of Pittsburgh 


This volume provides an introduction to the basic principles of testing
and evaluation that are essential to teachers for the effective assessment
of pupil achievement and aptitude.


Paperbound. 188 pages. $2.45 
For more detailed information, please write the publisher. 


HARCOURT, BRACE & WORLD, INC. 
New York / Chicago / San Francisco / Atlanta 


Consulting Psychologists Press presents a full range of techniques for appraising devel- 
opmental level of pre-school, nursery, kindergarten and early elementary children. 


A parent-administered device — 


• THE SCHOOL READINESS SURVEY


An ingenious new way to involve parents in objectively appraising their own 
children’s readiness and in expediting their children’s mental growth. 
Contains 7 easily administered and scored subtests to survey skills required in 
school, plus dozens of suggestions to parents for helping their children de- 
velop. Parent administration saves staff time and increases parental under- 
standing of their children and the learning process. 

Send $1 for Specimen Set 


A teacher-administered survey — 


• VALETT DEVELOPMENTAL SURVEY
OF BASIC LEARNING ABILITIES 


Contains 233 easily administered tasks from which to choose, covering 7 
areas: Motor Integration & Physical Development, Tactile & Auditory Discrimi- 
nation, Visual-Motor Coordination, Visual Discrimination, Language Develop- 
ment & Verbal Fluency, and Concept Development. For teacher use with 
children aged 2 to 7. 

Send $1 for Specimen Set (Demonstration Cards $3 extra) 


For the psychologist or specially trained teacher — 
• MARIANNE FROSTIG DEVELOPMENTAL
TEST OF VISUAL PERCEPTION 


For more refined evaluations of perceptual skills, this widely-used test yields 
scaled scores in 5 areas for children aged 4 to 8. 
Send $5 for Complete Specimen Set 


We distribute The Frostig Program for Development of Visual Perception and the new Pictures and Patterns.


For the school psychologist — 
• PSYCHOEDUCATIONAL PROFILE OF
BASIC LEARNING ABILITIES


Not a test but an 8-page profile for summarizing clinical information and test
results in 5 basic ability areas. Aids the psychologist in discussing his findings
with parents.
Send $.75 for Complete Specimen Set


For evaluating social competency of the mentally retarded —


CAIN-LEVINE SOCIAL COMPETENCY SCALES 


A 44-item scale to estimate social competence in the trainable mentally re-
tarded, chronological age 5-13. Subscales for Self-Help, Initiative, Social
Skills, and Communication. A useful aid in diagnosis, placement, and training.
Send $1.50 for Specimen Set


CONSULTING PSYCHOLOGISTS PRESS
577-D College Avenue, Palo Alto, California 94306


EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 


David A. Payne and Robert F. McMorris 


For courses focusing on tests and measurements, this text contains fifty-four theoretical and 
research papers providing a broad perspective on test development with an emphasis on 
the assessment and prediction of learning outcomes. A manual of examination questions is
available. 1967 419 pages Paper $5.75 


THE SPECIFICATION AND MEASUREMENT OF LEARNING OUTCOMES 
David A. Payne 
Intended for undergraduate and graduate courses, the text has a single purpose: to provide 
the classroom teacher with a practical and efficient set of techniques to aid in evaluating stu- 
dent achievement. Essential principles of educational and psychological measurement are 
presented along with many concrete examples. Abbreviated but valid statistical techniques 
are described with emphasis on their application for improving test analysis and interpre-
tation. 1968 209 pages Paper $2.75


STUDIES IN EDUCATIONAL PSYCHOLOGY 
Raymond G. Kuhlen
For courses focusing on the psychology of school learning, this text contains fifty papers 
dealing with the learning process with special attention to the cognitive, personality, and 
motivational factors involved. 1968 482 pages Paper $5.75 

READINGS IN THE PSYCHOLOGY OF CHILDHOOD AND ADOLESCENCE 
William J. Meyer 
For courses focusing on human development, this book contains sixty-five theoretical pa-
pers and research reports dealing with the main features of child and adolescent develop-
ment. 1967 436 pages Paper $5.25 

THE PSYCHOLOGY OF VOCATIONAL CHOICE 
John L. Holland 
This text offers a concise synthesis of current knowledge of vocational adjustment, and fur-
ther presents original theory and suggestions for new research. Written primarily for stu-
dents and professional audiences, the book effectively interprets data for practical applica-
tion. 1966 132 pages Paper $1.95


METHODS OF STUDYING THE INDIVIDUAL:
THE PSYCHOLOGICAL CASE STUDY 


BLAISDELL PUBLISHING COMPANY 
A Division of Ginn and Company 
275 Wyman Street, Waltham, Massachusetts 02154 


Chandler Publications in the fields of education and psychology

[NEW] Discipline in the Classroom: Basic Prin-
ciples and Problems
by Staten W. Webster,
University of California, Berkeley
The first part of the book examines the causes
of student malbehavior. Pointed out to the
reader is the necessity to view acts of malbe-
havior as products of possible interactions
among several factors including the person-
ality of the student, the personality of the
teacher, and the particular human physical
environment of the classroom.
The second part of the book offers case re-
ports of student behavior problems. The re-
ports include analyses of the problems and
proposed suggestions for solving the prob-
lems written by highly regarded and experi-
enced teachers.
Approximately 148 pages 5%” x8%"
©1968 paper List $2.70

Instruction: Some Contemporary Viewpoints
edited by Laurence Siegel,
Louisiana State University
The reader is familiarized with a substantial
body of knowledge, provided with organizing
concepts and principles, and is exposed to
the most recent work and thinking that has
been done in the field of education.

890 pages 5%2”x8%" ©1967 cloth

List $7.00

Motivation: Psychological Principles and Ed- 
ucational Implications 

by Melvin H. Marx, University of Missouri 

and Tom N. Tombaugh, Carleton University 
The book outlines the major efforts made thus 
far by psychologists to explain the multifold 
facts of what is commonly called “motivation”. 
The text provides students with a scientific 
explanation of the origin of such terms as ego, 
drive and aggression, while examining the 
empirical and theoretical methodology of the 
professional psychologist. 

304 pages 51%2”x8%" © 1967 paper 
List $2.95 

Learning: A Survey of Psychological Inter- 
pretations 

by Winfred F. Hill, Northwestern University 
The emphasis in this book is on learning
theory to provide basic understanding of the
learning process. Psychological interpreta-
tions of learning are classified into the famil-
iar divisions of connectionist and cognitive
and some of the attempts that have been 
made to combine the advantages of these 
two categories. 

238 pages 5%" x8%" ©1963 paper 
List $2.50 

From the Chandler Publishing Company, 

San Francisco 

Distributed by 


Science Research Associates, Inc. 
259 East Erie Street, Chicago, Illinois 60611 


A Subsidiary of IBM 


THE ONTARIO INSTITUTE FOR STUDIES IN EDUCATION 


102 Bloor Street West • Toronto 5 • Ontario • Canada
Recent publications from the Department of Applied Psychology 


Recent Research on the Acquisition of Conservation of Substance 
Educational Research Series No. 26, 1967, pp. v + 72, $1.75 


Edited by David W. Brison and Edmund V. Sullivan. Three investigations.


Learning Theory and Classroom Practice

Bulletin No. 1, 1967, pp. v + 34, $1.75 

By David P. Ausubel. Examines current learning theories.


Piaget and the School Curriculum: A Critical Appraisal 
Bulletin No. 2, 1967, pp. vii + 38, $1.25 
By Edmund V. Sullivan. Implications of Piaget’s theory of intellectual development. 


Accelerated Learning and Fostering Creativity 
1968, pp. iii + 40, $1.50 


Edited by David W. Brison. Four papers from the Phi Delta Kappa—OISE Centen- 
nial Symposium, Toronto, February 1967. 


Write PUBLICATIONS, OISE, 102 Bloor St. W., Toronto 5, Canada


McGraw-Hill Books 


CLASSICAL PSYCHO- 
PHYSICS AND SCALING 


By Sidney A. Manning, University of 
Maryland; and Edward Rosenstock, 
Pennsylvania State University. 

176 pages, $2.50 

This is a semi-programmed text de-
signed to teach psychophysics and
scaling as part of a general intro-
ductory course, or as an adjunct to
courses in Perception, Experimental,
or Physiological Psychology.


READINGS IN PSYCHO- 
LOGICAL FOUNDATIONS 
OF EDUCATION 


By Walter H. MacGinitie and Samuel
Ball, Teachers College, Columbia University


Available Spring, 1968 

The authors have selected articles 
that are understandable to college 
students with little or no background 
in psychology or statistics. The book 
is especially effective for use in a one- 
semester course. 


CLINICAL TEACHING 


By Robert M. Smith, Pennsylvania 
State University 

Available Spring, 1968 

In this text, the author considers the 
educational problems of the retarded 
child from the pre-school through the 
post-school periods. Suggestions are 
made for working with parents of the 
retarded within the structure of the 
school program. 


Examination copies upon request 
McGraw-Hill Book Company


330 West 42nd Street,
New York, New York 10036 


new from bobbs-merrill 


THE PREDICTION OF 
ACHIEVEMENT AND 
CREATIVITY 


by RAYMOND B. CATTELL and 
H. J. BUTCHER 


The authors have developed and utilized empiri- 
cal methods to observe and analyze basic person- 
ality factors. In this new work they survey the 
present state of knowledge in the field and report 
findings of their research on the personality and 
abilities of school children. In addition, they out- 
line a specific program for testing students at key 
points during the school years, and offer sugges- 
tions for stimulating greater creativity and 
achievement within an educational system. 

“this is a to, lity pi 

seria uubtogilodea highly: Grection 
and original thinking on the part
of the authors”


CERTAIN EFFECTS OF THE EXPECTATION TO
TRANSMIT ON CONCEPT ATTAINMENT¹


PAUL DAVIDSON REYNOLDS²
Stanford University

36 undergraduate women studied analytical concepts under 3 condi-
tions: (a) In the Alone condition (AC), 10 Ss studied the concepts 
expecting to be tested on their knowledge of them. (b) In the Peer 
condition (PC), 13 Ss studied the concepts expecting to transmit them 
to an undergraduate woman. (c) In the Child condition (CC), 13 Ss 
studied the concepts expecting to transmit them to a 6-yr-old boy. 
All Ss, the “transmitters,” after one-way verbal communication to 
their receivers, were tested on their knowledge of the concepts. The 
Ss were divided into exceptionally high (EHM) and above average 
math (AAM) aptitude groups on the basis of their math aptitude 
scores on the College Entrance Examination Board tests. The per-
formance scores of the AAM Ss in the PC and CC were lower (p <
.05) than the scores of the AAM Ss in the AC.


Recent summaries of research on cogni-
tive processes, problem solving, and think-
ing (Bruner, Goodnow, & Austin, 1962;
Duncan, 1959; Gagné, 1959; Kendler, 1961;
Leeper, 1951) are notable in that they re-
flect an absence of attention to the sub-
ject’s (S’s) purpose in acquiring concepts,
whether the individual is learning for his
own purposes or expects to relay his newly
acquired knowledge to another. Zajonc
(1960) and Cohen (1961), interested in ex-
amining the nature of cognitive structures
that are activated or “tuned in” when per-
sons enter into communication with others,
have studied the effect of expecting to trans-
mit information compared with the effect
of expecting to receive additional informa-
tion on the way an individual organizes in-
formation he has received. The purpose of
this study is to consider if the expectation
to transmit knowledge will reduce the ca-
pacity of an individual to acquire difficult-
to-describe material.

¹ This research was originally done in partial
fulfillment of the requirements for the master’s
degree in psychology. The author wishes to thank
his advisor, Alex Bavelas, who provided the in-
tellectual stimulation that made this investigation
possible and maintained the guidelines that kept
the development of these ideas oriented in a fruit-
ful direction. Thanks are also due to Louise R.
Pierce. She solicited most of the subjects from the
student volunteer tutors who participated in the
Stanford Tutorial Project, of which she was the
director. Nanci Irene Moore was kind enough to
provide some of the original data from her re-
search for her master’s thesis.
² Currently with the Department of Sociology,
Stanford University.

Given the basic situation of an S learning 
a concept and then transmitting it to an- 
other, the following is one possible con- 
ceptualization of the individual’s cognitive 
processes. Assume that the individual has 
two types of closely related repertoires: (a) 
A conceptual repertoire of approaches, 
strategies, logical systems, or methods of 
analysis that he utilizes when attempting to 
solve a problem or understand any type of 
subject matter. (b) A symbolic repertoire of 
signs, words, symbols, and phrases that he 
uses to code the concepts in the conceptual 
repertoire, to store information of any type, 
and to transmit concepts to another indi- 
vidual. 

This conception of a symbolic repertoire 
implies that any mental activity of the in- 
dividual, conscious or unconscious, utilizes 
symbols from the symbolic repertoire, par- 
ticularly when he uses concepts from the 
conceptual repertoire. Considering only 
those symbols the individual uses in trans- 
mitting information and ideas to another 
individual, the following can be defined: 
The transmission vocabulary consists of all 
those symbols, words, signs, or phrases that 
the individual uses to communicate with or 
to transmit to another individual. An im- 
plication of this conceptualization is that 
the symbols that compose the transmission 
vocabulary are a subset of the symbolic
repertoire.

The present experiment investigates the
effect an expectation to transmit will have
on the learning performance of an individ-
ual. For ease of analysis, two situations can
be compared: (a) The learning-without- 
expectation-to-transmit situation, where the 
individual learns a subject matter with no 
expectation that he will transmit it to a 
receiver. (b) The learning-with-expecta- 
tion-to-transmit situation, where the indi- 
vidual learns a subject matter expecting to 
transmit it to a receiver. Only when the 
transmitter has completed his learning does 
he transmit his newly acquired knowledge 
to a receiver. 

Considering the two situations described 
above, there are two questions of interest: 
(a) Is the learning of the individual 
impaired in the learning-with-expectation- 
to-transmit situation as compared to the 
learning-without-expectation-to-transmit
situation? (b) Is there a loss of knowledge 
in the transmission between the transmitter 
or “learner” and the receiver? Based on 
these questions, there are four possible cases 
of the effects of the learning-with-expecta- 
tion-to-transmit situation when it is com- 
pared to the learning-without-expectation- 
to-transmit situation. These are arrived at 
by first considering that the learning of the 
subject may not be impaired (L) or may 
be impaired (—L) and then considering that 
there is no loss in the transmission of sub- 
ject matter (T) or that there is a loss in 
transmission (—T).

The four possible cases of the learning-
with-expectation-to-transmit situation are
as follows:

1. In the L,T case there would be no re-
duction in S’s learning of the material when
compared to the learning-without-expecta-
tion-to-transmit situation. After S’s trans-
mission to a receiver, the receiver demon-
strates knowledge of the material equal to
that of the transmitter.

2. In the L,—T case there would be no
reduction in S’s learning of the material
when compared to the learning-without-ex-
pectation-to-transmit situation. After S’s
transmission to a receiver, the receiver
demonstrates less knowledge of the material
than the transmitter.

3. In the —L,T case there would be a re-
duction in S’s learning of the material when
compared to the learning-without-expecta-
tion-to-transmit situation. After S’s trans-
mission to a receiver, the receiver demon-
strates knowledge of the material equal to
that of the transmitter.

4. In the —L,—T case there would be a
reduction in S’s learning of the material
when compared to the learning-without-ex-
pectation-to-transmit situation. After S’s
transmission to a receiver, the receiver
demonstrates less knowledge of the material
than the transmitter.
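
The four cases amount to a 2 x 2 classification, which can be sketched in code. The following is an illustration only and is not part of the original study. The names `Outcome`, `case_label`, and `classify` are hypothetical, a plain hyphen stands in for the negation sign, and the rule of comparing raw scores directly (with no test of significance) is an assumption made for brevity.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """One cell of the 2 x 2 scheme (hypothetical helper type)."""
    learning_impaired: bool   # True corresponds to -L, False to L
    transmission_loss: bool   # True corresponds to -T, False to T


def case_label(outcome: Outcome) -> str:
    """Return the label of the case, e.g. 'L,-T'."""
    learning = "-L" if outcome.learning_impaired else "L"
    transmission = "-T" if outcome.transmission_loss else "T"
    return f"{learning},{transmission}"


def classify(alone_score: float,
             transmitter_score: float,
             receiver_score: float) -> str:
    """Classify a result into one of the four cases.

    Assumption: learning counts as impaired when the transmitter scores
    below the learning-without-expectation-to-transmit baseline, and
    transmission counts as lossy when the receiver scores below the
    transmitter. A real analysis would use a statistical test instead.
    """
    outcome = Outcome(
        learning_impaired=transmitter_score < alone_score,
        transmission_loss=receiver_score < transmitter_score,
    )
    return case_label(outcome)


if __name__ == "__main__":
    # Made-up scores, chosen only to exercise each of the four cases.
    for scores in [(10, 10, 10), (10, 10, 7), (10, 8, 8), (10, 8, 5)]:
        print(scores, classify(*scores))
```

Keeping the two questions as separate boolean fields mirrors the way the text arrives at the cases: first whether learning is impaired, then whether there is a loss in transmission.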

If the L,T case occurs, where there is no 
impairment of learning or transmission, it 
may be assumed that the expectation to 
transmit has no observable effect on the in-
dividual’s cognitive processes.

The L,—T case, where there is no im-
pairment of learning but there is impair-
ment of transmission, would imply that the
individual used his conceptual repertoire
with facility in learning the subject matter,
but that some of the symbols used in learn-
ing the subject matter were not a part of
the transmission vocabulary. In other
words, the individual was readily able to
learn concepts and ideas, but was not able
to express ideas that he understood. This
reasoning would imply that the symbolic
repertoire includes more symbols than the
transmission vocabulary, which is an issue
of some interest.

If the —L,T case occurs, where there is 
an impairment of learning but no impair- 
ment in transmission, the reduction in learn- 
ing performance may be attributable to 
either or both of two effects: First, if we 
assume that the transmission vocabulary 
is smaller than the total symbolic repertoire, 
it may be that S, knowing he will be re- 
quired to transmit the soon-to-be-learned 
subject matter to another, will restrict his 
conceptual repertoire to those concepts that 
are represented by symbols in the trans- 
mission vocabulary. In other words, it may 
be that in attempting to understand the sub-
ject matter the individual utilizes only
those concepts or schemata that he can 
transmit to the receiver with ease and con- 
fidence. 

Second, it may also be that S is not
sure that the receiver has a conceptual
repertoire that is equivalent to his and may
restrict his conceptual “tool kit” to only
those concepts that he is sure that the re-
ceiver will understand, transmission of the
concepts being of minor concern. For ex-
ample, the easiest way to represent algebrai-
cally a particular geometric figure may be
to utilize polar coordinates. But if the trans-
mitter perceives that the receiver is only
accustomed to rectangular coordinates, he
may attempt to utilize the inappropriate
rectangular coordinates and, as a result, may
fail to understand the geometric figure him-
self.


If the —L, —T case occurs, where there is 
an impairment of learning and an impair- 
ment of transmission, all three of the effects 
described in the two previous cases (L, —T 
and —L,T) could be occurring. 

Finlay (1966) studied the difference in
the effect on a receiver’s performance after
he received written instructions from a
transmitter who had just had an opportu-
nity to learn a skill—arranging a 2-foot
chain to achieve a high score. The major
result was that if the transmitter and the
receiver did not know in advance that the
transmitter was to provide the written in-
structions, the receiver did significantly
better on the task than if the transmitter
and receiver did know in advance that the
transmitter would provide written instruc-
tions for the receiver.

The results of the Finlay experiment al-
low speculation that, given certain types
of subject matter, an individual in the
learning-with-expectation-to-transmit situ-
ation will transmit less information to a re-
ceiver than an S in the learning-without-ex-
pectation-to-transmit situation. In the
above discussion of this case (L,—T) we
concluded that the occurrence of this phe-
nomenon would imply that the transmission
vocabulary was smaller than the symbolic
repertoire.


The purpose of the present study is to
determine if the other possible cases, —L,T
or —L,—T, of the learning-with-expecta-
tion-to-transmit situation could occur
for certain types of subject matter. Specifi-
cally, the interest is the possibility of a
decrement in the learning performance of Ss
when they expect to transmit. The major
question asked is: Is the test performance
of Ss after acquiring concepts affected when
they expect in advance to transmit the con-
cepts?

If there is a reduction in the amount of 
learning that an individual exhibits in the 
learning-with-expectation-to-transmit situ- 
ation, it should be strongest when the char- 
acteristics of the receiver lead the indi- 
vidual to reduce either his transmission 
vocabulary or his conceptual repertoire. 
Therefore, an additional question will be 
addressed: Will the individual’s perception 
of the ability of the receiver to understand 
his transmission have an effect on the in- 
dividual’s test performance after acquiring 
concepts? 


THE CONCEPTUAL PROBLEM


The concepts were analytical in nature 
and are discussed in detail in Reynolds 
(1966, Appendix II). The Ss attempted to 
determine the concepts common to each of 
four categories of designs in the learning 
set, Figure 1. The two images in each rec- 
tangle represent a single design. The colors 
(green, blue, red, and yellow) were used 
only for identification of the categories. 

If an individual has an opportunity to 
learn a subject matter, and can demonstrate 
knowledge of the material through non- 
verbal responses, the translation of ideas or 
concepts into words, or symbols from the 
transmission vocabulary, is unnecessary.
It may be possible to assume that the trans- 
mission vocabulary is being bypassed and 
that knowledge of the subject matter is be- 
ing measured directly. For this reason, all 
Ss demonstrated knowledge of the concepts 
by sorting designs, similar to those in the 
learning set, into four categories without 
access to the learning set. 


METHOD AND PROCEDURE


The main factor that is expected to affect the 
learning performance of S is the expectation that 
what is learned be put into words. In addition, it 
is expected that the transmitter’s perception of 
the size and sophistication of the receiver's con- 
ceptual repertoire and transmission vocabulary 
may have an influence on the magnitude of the 
effect. To investigate the second issue, an attempt 
was made to manipulate the transmitter’s per- 
ception of the (a) experience and (b) development 
of the receiver’s conceptual repertoire and trans- 
mission vocabulary. 

For the Peer condition, it was assumed that Ss
would perceive another individual of like sex and
approximately equal social and educational back-
ground as having an equally developed conceptual
repertoire and transmission vocabulary. For the
Child condition, it was assumed that Ss would
perceive a child as having a less developed con-
ceptual repertoire and transmission vocabulary.

Fig. 1. Task learning set. (Columns from right to left are green, blue, red, yellow.)

Since it was impossible to test how much each S
would have learned if they had not expected to
transmit, an Alone condition was used to measure
the learning performance of comparable Ss when
they did not expect to transmit.

The experiment involved three conditions: The
Peer condition (PC). The S, expecting to instruct
a peer, studied the learning set, instructed the peer
receiver, and, without prior notice, was asked to
take the test. The Child condition (CC). The S,
expecting to instruct a child, studied the learning
set, instructed the child receiver, and, without
prior notice, was asked to take the test.
The Alone condition (AC). The S studied the
learning set and then
took a test of her knowledge when she indicated
she was ready.

Thirteen Ss experienced the PC, another 13 Ss 
experienced the CC, and 10 Ss experienced the AC. 


Subjects 


The 49 undergraduate women that served as Ss,
36 as transmitters and 13 as peer receivers, were
volunteers recruited in three ways: 32 from a
group of unpaid tutors who were working with
potential dropouts in public schools, 14 from
undergraduate courses, and three were friends of
Ss participating in the research.

It was decided that it was best to make the
manipulation in the case of the child receivers as
strong as possible without arousing the suspicion
of Ss. Therefore, a first grade child was chosen as
the lowest level of perceived intellectual develop-
ment that was high enough so that Ss might feel
that they had some chance of explaining the con-
cepts to a child receiver. Two child confederates
were used, both 6-year-old boys. They received
25 cents for each participation in the experiment
and were fully aware of the part they were ex-
pected to play.


Peer Condition 


When the two women arrived for the PC, the
roles of the transmitter and the receiver were
randomly assigned and the receiver was asked to
wait outside the room while the transmitter re-
ceived the following instructions:

1. The transmitter was told that the purpose of
the study was to determine how well the trans-
mitter could transfer concepts to the receiver. The
transmitter was told that she was the receiver’s
only source of information and that the main
measure of performance was the receiver's test
score.

2. Using a sample task it was demonstrated that
although the actual designs in the learning set and
test set were different, they could be grouped on
an abstract basis. All Ss completed the sample
task without errors.

3. The transmitters were told that the only
communication allowable between them and the 
receiver would be their verbal instructions. No 
visual cues to nor any verbal utterances on the 
part of the receiver were allowed. 

4. The time limits were described in detail.
The transmitter was to be given a maximum of
35 minutes to study the concepts and instruct the
receiver. This 35 minutes was divided into two
parts, a study period which could last up to 20
minutes during which the transmitter was not re-
quired to say anything and an instruction period
which lasted until the transmitter voluntarily
terminated the instructions or 35 minutes had
passed. The transmitter was told that the receiver
would be given all the time she required to com- 
plete the test. 

The complete set of instructions was always 
read to the transmitter (taking about 20 minutes) 
who was encouraged to ask questions about any 
part that she did not understand. The transmitter 
and receiver were permitted to ask questions about 
procedure during the experiment. 

Following the instruction of the transmitter, 
the receiver was brought back into the room and 
seated at a table back-to-back with the trans- 
mitter. She was then given an abbreviated set of 
instructions which stated that the receiver was to 
separate cards in accordance with the instructions 
of the transmitter and that she was not allowed 
to say anything about the task or see what the 
transmitter was doing. The receiver was told that 
the focus of the study was on her test performance 
and that once she had received her test set, she 
would be allowed all the time she wanted to com- 
plete the test. 

Upon completion of the instructions, the trans- 
mitter was immediately given the learning set, 
and the procedure described to the transmitter 
was followed until the transmitter’s instructions 
to the receiver were completed. When the trans- 
mitter voluntarily terminated her instructions, or 
the 35 minutes had elapsed, the learning set was 
removed from the transmitter and the transmitter 
was given a test set and asked to separate the de- 
signs into the proper categories, taking as long as 
desired. At no time did any S appear to be 
hampered by the time limits.

The transmitter was encouraged to be as ac- 
curate as possible. When she had finished the test, 
she was asked a series of questions about the ex- 
periment and the entire purpose of the study was 
explained to both transmitter and receiver. 


Child Condition 


Because the child receivers were confederates, 
the design of the procedure was intended to mini-
mize the contact between the transmitter and the
child receiver until after the transmitter had 
taken the test. In the CC, the experimental pro- 
cedure was exactly the same as that in the PC 
except that the child receivers made no attempt 
to take the test. Although no transmitters were 
experiencing close and continuous contact with 
children outside the experimental situation, none 
showed any apprehension or nervousness about
teaching a child. 


Alone Condition 


In the AC, the procedure was varied by elim- 
inating all instructions that pertained to the re- 
ceiver, retaining the remainder of the instructions, 
including the sample problem. The time limits re- 
mained the same except that it was necessary to 
put a 10 minute minimum time limit on the study 
session, while retaining the 35 minutes maximum 
time limit. The Ss in this condition knew from 
the start that they would be tested on their knowl- 
edge of the concepts. 

The last question of the postexperimental in- 
terview was: “When did you first realize that 
you were going to take the test yourself?” For all 
transmitters in the PC and CC conditions, the 
answer was, in effect, “When you handed me the 
cards.”


Scoring of Conceptual Attainment 


Since the most important data collected are the 
scores that the transmitters achieved on the test 
of their knowledge of the concepts in the learning 
set, it is proper to consider what this score ac- 
tually means in terms of conceptual acquisition. It 
seemed appropriate to design a procedure that 
would permit a count of the number of concepts 
acquired, utilizing this figure as an ordinal measure 
of performance. The following assumptions re- 
sulted in a procedure that allowed the translation 
of the numerical score, which could vary from 0 to 
72, to a measure of the number of concepts ac- 
quired, which could vary from 1 to 4. 

Assuming that an S had no knowledge of the 
concepts and distributed the 72 test designs to the 
four categories on a random basis, it would be ex- 
pected that 18 of the designs would be correctly 
classified by chance (72/4 = 18), Level 1. 

Assuming that an S understood one concept 
well, placed 18 designs into one correct category, 
and distributed the remaining 54 designs on a 
random basis; it would be expected that 18 de- 
signs would be correctly placed by chance into the 
remaining 3 categories for a total score of 36 (18 
[from knowledge] + 54/3 [by chance] = 36), 
Level 2. 

Assuming that an S understood two concepts 
well, placed 36 designs into two correct categories, 
and distributed the remaining 36 designs on a 
random basis; it would be expected that 18 de- 
signs would be correctly placed by chance into 
the remaining 2 categories for a total score of 54
(36 [from knowledge] + 36/2 [by chance] = 54),
Level 3.

Assuming that an S understood three concepts
well, placed 54 designs into three correct categories, 
and put the remaining 18 designs into one group; 
it would be expected that a score of 72 would re- 
sult (54 [through knowledge] + 18 [by default] = 
72). Finally, assuming that an S understood all 
four concepts, it would be expected that all 72 
designs would be correctly placed into the four 
categories for a score of 72. Since it is impossible 
to separate these two levels of conceptual attain- 
ment through examination of the test score alone, 
both of these occurrences are considered Level 4. 

The procedure used for making inferences about 
the number of concepts attained by Ss from their 
performance score was as follows: The Ss who 
scored between 0 and 27 were placed in Level 1. 
The Ss who scored between 28 and 45 were placed
in Level 2. The Ss who scored between 46 and 63
were placed in Level 3. The Ss who scored between
64 and 72 were placed in Level 4. There was no
problem in classifying scores with this procedure
since, with one exception, Ss’ scores were clustered
around 36, 54, or 72.
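
The chance-plus-knowledge reasoning and the score cutoffs described above can be restated as a small calculation. The following Python sketch is illustrative only: the function names are hypothetical and do not come from the original study, and the cutoffs are simply those stated in the text.

```python
TOTAL_DESIGNS = 72
CATEGORIES = 4
PER_CATEGORY = TOTAL_DESIGNS // CATEGORIES  # 18 designs belong to each category


def expected_score(concepts_known):
    """Expected test score if `concepts_known` concepts are understood and the
    remaining designs are spread at random over the remaining categories."""
    from_knowledge = concepts_known * PER_CATEGORY
    remaining_designs = TOTAL_DESIGNS - from_knowledge
    remaining_categories = CATEGORIES - concepts_known
    if remaining_categories == 0:
        return from_knowledge
    # With one category left, every remaining design lands in it "by default".
    return from_knowledge + remaining_designs // remaining_categories


def attainment_level(score):
    """Translate a raw score (0 to 72) into a level (1 to 4) using the
    cutoffs given in the text."""
    if not 0 <= score <= TOTAL_DESIGNS:
        raise ValueError("score must lie between 0 and 72")
    if score <= 27:
        return 1
    if score <= 45:
        return 2
    if score <= 63:
        return 3
    return 4


if __name__ == "__main__":
    for known in range(CATEGORIES + 1):
        score = expected_score(known)
        print(known, score, attainment_level(score))
```

Each cutoff falls midway between two adjacent expected scores (18, 36, 54, 72), which is why scores clustered near those values can be classified without ambiguity.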

After translating Ss’ scores into levels of con- 
ceptual attainment, it is appropriate to consider 
this an ordinal measure of performance. Siegel
(1956, p. 136) describes the Kolmogorov-Smirnov
two-sample test and a chi-square approximation
for small unequal sample sizes which appears to
meet the assumptions of the ordinal measure of
conceptual attainment.


RESULTS


Comparison of the transmitters’ per- 
formance level scores across the three ex- 
perimental conditions shows no statistically 
significant differences and no clear direc-
tion of improvement, although the best per-
formances were in the Alone condition.


the test concepts appeared to be
analytical in nature, it
average math” (AAM) group of 16 Ss with
scores ranging downward from 650.

The performance scores of the trans-
mitters, after they had been classified into
the two groups, are shown in Table 1. The
change in performance among the AAM
transmitters is statistically interesting and
indicates that the expectation to transmit
may have an effect on the learning per- 
formance of the transmitter, for the AAM 
transmitters in both the PC and CC have 
lower performance scores than the AAM 
transmitters in the AC. 

If the same procedure is followed using
the verbal aptitude scores of the transmit-
ters, the results among the six cells show
the same trends as found with the separa-
tion based on the math aptitude scores, but
the statistical significances are much lower.
This similarity may be due in part to the
high correlation between the math and ver-
bal aptitude scores. Using a simple linear
regression to obtain a Pearson r and utiliz-
ing 50 undergraduate women on whom data


TABLE 1

NUMBER OF TRANSMITTERS ATTAINING EACH
PERFORMANCE LEVEL

                            Transmitters’ math aptitude
            Performance   Above average           EHM
Condition   level
            attained      Number     %       Number     %

Alone           4            4      100         3       50
                3            4      100         5       83
                2            4      100         6      100
                1            4      100         6      100
Peer            4            0        0         6       67
                3            3       75         9      100
                2            4      100         9      100
                1            4      100         9      100
Child           4            2       25         5      100
                3            5       63         5      100
                2            6       75         5      100
                1            8      100         5      100

Note.—Using a one-tailed chi-square approxi-
mation of the Kolmogorov-Smirnov two-sample
test for unequal sample sizes, the following just
significant levels result: AAM Alone vs. AAM
Peer: p < .02; AAM Alone vs. AAM Child: p <
.05; AAM Peer vs. AAM Child: p < .40, regardless
of direction; EHM Peer vs. AAM Peer: p < .083;
EHM Child vs. AAM Child: p < .03.

were available, the correlation between the
two aptitude scores was .61 (p < .0001). 
It will be remembered that a naive S was 
used for a receiver in each trial in the PC. 
The peer receivers’ test scores were classi- 
fied into four levels on the same basis as 
the test scores of the transmitters. The 
scores of the peer receivers were then com- 
pared to the scores of the transmitters, 
comparing the scores of the EHM transmit- 
ters to the scores of their receivers and the 
scores of the AAM transmitters to the 
scores of their receivers. Again utilizing the 
Kolmogorov-Smirnov two-sample test, it 
was apparent that there was no difference 
between the test performances of the trans- 
mitters and receivers. In fact, only two out 
of 13 receivers failed to equal the perform- 
ance level of their respective transmitters 
and these two were only one performance 
level lower. The results are the same if the 
comparison is made on the basis of the 
transmitters’ verbal aptitude scores. 
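
The chi-square approximation of the Kolmogorov-Smirnov two-sample test used in these comparisons can be sketched in a few lines of Python. This is a hedged illustration, not the author's computation: it assumes that the counts in Table 1 are cumulative (the number of Ss at or above each level), that the statistic is chi-square = 4D²n₁n₂/(n₁ + n₂) with 2 degrees of freedom, and it uses a hypothetical function name. For brevity it takes the largest absolute difference, whereas a one-tailed test considers only differences in the predicted direction.

```python
import math


def ks_chi_square(cum_a, n_a, cum_b, n_b):
    """Chi-square approximation (2 df) to the two-sample Kolmogorov-Smirnov
    test, from cumulative counts listed in the same level order."""
    # D is the largest gap between the two cumulative proportions.
    d = max(abs(a / n_a - b / n_b) for a, b in zip(cum_a, cum_b))
    chi2 = 4 * d ** 2 * (n_a * n_b) / (n_a + n_b)
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    p = math.exp(-chi2 / 2)
    return d, chi2, p


# Cumulative counts for Levels 4, 3, 2, 1 (above-average math aptitude group).
AAM_ALONE = ([4, 4, 4, 4], 4)
AAM_PEER = ([0, 3, 4, 4], 4)
AAM_CHILD = ([2, 5, 6, 8], 8)

if __name__ == "__main__":
    for label, other in (("Alone vs. Peer", AAM_PEER),
                         ("Alone vs. Child", AAM_CHILD)):
        d, chi2, p = ks_chi_square(*AAM_ALONE, *other)
        print(f"{label}: D = {d:.2f}, chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the Alone vs. Peer comparison gives a chi-square of 8 and the Alone vs. Child comparison a chi-square of 6.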


DISCUSSION AND CONCLUSION


The major issue that has been investi- 
gated is whether the expectation of trans- 
mitting a concept to another individual will 
affect its acquisition by the transmitter. 
The framework proposed earlier hypothe- 
sized that an individual might have a con- 
ceptual repertoire that is used for learning 
and problem solving (utilizing symbols 
from a symbolic repertoire) and a trans- 
mission vocabulary (that may be a subset 
of the symbolic repertoire) that is used for 
interpersonal communication. Since it was 
proposed that concepts difficult to describe 
might cause the greatest decrement in con- 
ceptual attainment, a task was designed 
that was difficult to verbalize but which al- 
lowed objective measures of performance. 

The experimental design attempted to 
measure, in a control condition, the ability 
of Ss to acquire the concepts without the 
distraction of expecting to transmit. This 
condition was compared with two others: 
one where S might perceive the receiver, a 
scholastic peer, as having an equally de- 
veloped conceptual repertoire and transmis- 
sion vocabulary, and another condition 
where the transmitter might perceive the re-
ceiver, a 6-year-old boy, as having a less
developed conceptual repertoire and trans- 
mission vocabulary. 

A major problem resulted from the lack 
of variation in the dependent variable, the 
transmitters’ test scores. To prevent sus- 
picion of the transmitters in the Child Re- 
ceiver condition, the task was designed so 
that the transmitters might infer that a 6- 
year-old could understand the concepts. 
However, to provide variation in the trans- 
mitter’s performance, it is desirable to have 
a task difficult for the S population. As most
transmitters (20 out of 36) attained maxi- 
mum scores on the test, it would appear 
that this task was easy for the population 
from which the transmitters were drawn. 

After the transmitters were divided into 
two groups on the basis of their math apti- 
tude scores (considered an independent 
measure of their ability at the task), the 
transmitters with above average math ap- 
titude scores in the learning-with-expecta- 
tion-to-transmit situations demonstrated 
significantly less learning than those in 
the learning-without-expectation-to-trans- 
mit situation. 

In the Peer condition, the peer receivers 
had test scores equal to those of the trans- 
mitters. The Peer condition then corre- 
sponds to the —L,T case described earlier. 
Assuming that the transmitters perceived 
that the receiver’s conceptual repertoire 
and transmission vocabulary were equal to 
their own, it may be inferred that the very 
act of expecting to transmit led the trans- 
mitters to reduce their conceptual “tool 
kit” and utilize only those concepts they 
could transmit with ease and confidence in 
learning the subject matter. Given these as- 
sumptions, it may be concluded that the 
transmission vocabulary is significantly 
smaller than the symbolic repertoire. 

Examining the results from the Child 
condition, there is only tentative evidence 
that there may be a greater potential for a 
reduction in learning than in the Peer con- 
dition. Considering the exploratory nature 
of this research, it would appear that this 
situation deserves further attention. 

One important unanswered question is 
the effect on an S in a learning-with-ex-
pectation-to-transmit situation when the
receiver is perceived as having a more
sophisticated and developed conceptual re- 
pertoire and transmission vocabulary. If 
there is a decrement in learning in this 
situation, then the evidence is very strong 
that a reduction occurs in the transmitter’s 
conceptual tool kit and that it is mediated 
through the transmission vocabulary. 

Another major issue is the possible in- 
teraction effect between the expectation to 
transmit and the expectation to be tested. It 
will be recalled that the Alone Ss expected 
to be tested and the transmitters, who ex- 
pected to transmit, did not expect to be 
tested. In essence, this issue concerns the
effect on the transmitters when they ex-
pect to be evaluated on both their own
test performance and the test performance
of the receiver.

It should be mentioned that in the Finlay 
(1966) experiment the transmitters in the 
learning-with-expectation-to-transmit situ- 
ation did not demonstrate a decrement in 
learning performance. Finlay’s major re-
sult was the effect on the receiver's per-
formance, which was significantly lower in
the learning-with-expectation-to-transmit
situation. Her result corresponds to the
L,—T case, discussed earlier, while the
present results appear to correspond to the
—L,T case. It may be that this difference
is attributable to the radically different 
types of material that the transmitters 
were asked to learn in the two experiments, 


PAUL DAVIDSON REYNOLDS


a skill in the Finlay study and difficult-to-
describe concepts in the present study.
This raises another issue that deserves at-
tention: What are the characteristics of
subject matter that lead to these different
effects in the learning-with-expectation-to-
transmit situation?


EFFECTS OF RULES OF THUMB ON TRANSFER OF TRAINING¹


TRACY H. LOGAN² AND KENNETH H. WODTKE
Pennsylvania State University 


A rule of thumb was added to an instructional program designed to 
facilitate transfer to problems by means of a more general principle. 
The presence of the rule of thumb produced a marked decrement in 
performance on the transfer tasks. Only 20% of the students who were 
given the rule achieved perfect scores on a transfer test, while 75% 
of the no-rules group achieved perfect scores. The transfer decrement 
occurred in spite of the fact that the students were given several 
didactic warnings indicating that the rule would not apply on the 
transfer problems. The poor performance of the rule-of-thumb groups 
may have resulted from their misuse or overgeneralization of the 
rule. The results were discussed in terms of the classical negative
transfer paradigms and the effects of a persistent set on problem
solving.


A number of previous experiments 
(Hendrickson & Schroeder, 1941; Hilgard, 
Irvine, & Whipple, 1953; Judd, 1908; 
Overing & Travers, 1966) have demon- 
strated that knowledge of a relevant prin- 
ciple facilitates transfer to problems which 
involve an application of that principle. 
All of the above experiments have in
common the fact that the principles taught
were applicable to all of the transfer prob-
lems employed in the experiment. The
principle had no exceptions. This situation
is somewhat atypical of many instructional
situations in which a number of principles 
of varying generality must be taught. The 
student’s task is complicated by the need 
to learn exactly when each of several prin- 
ciples is applicable. Such a learning situa- 
tion would seem to provide many op- 
portunities for negative transfer. 

A negative transfer situation may arise 
when rules of thumb³ for solving problems


¹The writers wish to express their appreciation
to Bobby R. Brown who assisted in the data analy-
sis, and to William Rabinowitz for commenting
on an earlier draft of the paper. Partial support for
the study was obtained under a United States Of-
fice of Education Project No. 5-85-074. The study
was completed while the first author was on a Na-
tional Science Foundation Faculty Fellowship.

²Now at Wabash College, Crawfordsville, In-
diana.

³The terms “principle” and “rule” have not, in
the writers’ judgment, been carefully defined in the 
literature. The question of what is learned when a 
student learns a rule or principle has not been ade- 
quately answered, and progress in this area will be 
slow until an adequate answer to this question is ob- 


are taught. In the present discussion a rule 
of thumb is simply regarded as a principle 
having only very limited generality. Such 
rules are often taught in conjunction with 
a more general principle. They are quite 
common in subjects such as mathematics, 
statistics, and science, and are justified as 
shortcut, time-saving procedures. The great 
difficulty with most rules of thumb is that 
they are usually only applicable to a very 
limited class of problems. Due to its 


tained. The learning of a rule may consist of the 
chaining of a series of implicit or explicit verbal 
mediating responses which become linked to a cer- 
tain class of problems. In problem solving, the proc- 
ess may be analogous to a series of if-then state- 
ments such as “If I am confronted with problem 
type X, then rule Y applies.” The rule itself may be 
a relatively simple and straightforward statement 
of how to arrive at a solution to the problem as in 
most rules of thumb, or it may be a complex chain 
of mediating responses such as, “if A is true, then B 
must be equal to C, and therefore D equals... , 
etc.” Rules may differ along a number of dimen-
sions; thus, rules may vary in their generality or the
variety of problems to which they apply, they may 
vary in the complexity of the mediational chains 
involved in applying the rule, or they may vary in 
their meaningfulness. Typically, the term “rule” is 
used when generality is limited and when the me- 
diating response chains are relatively simple and 
straightforward. The term “principle” usually im- 
plies greater generality and greater complexity of 
if-then chaining. In the present study, the term 
“principle” is used to refer to a problem solving 
process which has wide generality, and which is 
fairly complex in the mediational processes in- 
volved. The term “rule of thumb” is used to refer 
to a rule which has less generality, and which in-
volves much simpler if-then chaining.

limited generalizability, a rule of thumb
can rarely mediate transfer from an origi- 
nal set of problems to a new set of prob- 
lems. Much more generally applicable 
principles are needed to facilitate transfer 
to new problems. In spite of its limited 
generality, however, a rule of thumb may 
be invoked in the transfer situation be- 
cause of the student’s tendency to overgen- 
eralize its use, and his failure to recognize 
its exceptions. Experience in the classroom 
often testifies to the fact that students ap- 
pear to have a strong set to “blindly” ap- 
ply a rule of thumb regardless of its ap- 
plicability. 

Schulz (1960) has suggested that the 
same principles which govern transfer in 
verbal learning may apply to some aspects 
of problem solving. If a general principle
and a rule of thumb were both associated 
with a common class of problems, previous 
research on transfer would lead to the pre- 
diction that each would tend to generalize 
to similar problems on a transfer task. Any 
generalization of the limited rule of thumb 
would lead to incorrect solutions on any 
transfer problems to which it did not ap- 
ply. The interference of the rule of thumb 
with the application of the more general 
principle would be expected to increase 
when the transfer problems were highly 
similar to the training problems, and when 
students were unable to determine the 


1964; 


The present experiment sought to pro-
vide information on two questions related
to the above expectations:

1. What is the effect on transfer of in-
cluding a rule of thumb having only
limited generality in an instructional pro-
gram designed to teach a more general
principle?

2. Will additional opportunities to prac-
tice using the general principle reduce the
amount of negative transfer resulting from
the knowledge of the rule of thumb?


METHOD


Subjects 


The subjects (Ss) for the investigation consisted 
of 79 students from an introductory educational 
psychology class who volunteered for the experi- 
ment, and were given course credit for their partici- 
pation. The experiment was conducted during the 
spring term, 1966. Although Ss represented a variety 
of different majors, a large proportion were enrolled 
in the teacher education program. 


Materials 


The topic of instruction chosen for the investi- 
gation was the concept of significant figures at the 
level of high school or introductory college physics.
The concept of significant figures was especially 
suitable for a study of the effects of rules of thumb 
on transfer, since teachers and textbooks typically 
employ several rules and principles in teaching the 
concept. 

Instruction was presented to Ss in the form of a 
self-instructional programmed text. There were
three versions of the program corresponding to each
of three treatment conditions. Version 1 consisted
of 37 main-trunk frames, remedial frames, and
several practice problems designed to teach Ss the
general principle of significant figures. The Ss were
taught the basic reasons for the loss of significance
in figures. The Ss were shown that when a measure-
ment is taken, the accuracy of that measurement is
restricted by the limits of the scale’s graduations.
Thus, in a measurement of 9.53 centimeters taken
from a typical meter stick, the figure 3 representing
3 hundredths of a centimeter is usually estimated
visually since the meter stick is not graduated in
hundredths of a centimeter. Since different investi-
gators are likely to vary in their estimates of such
figures, the results of calculations involving these
estimates will also vary. For example, the following
problem illustrates the errors that are likely to be
made, and the procedure for arriving at the “signifi-
cant” figures assuming that the error in the esti-
mated value is ±.01. The problem is to find the area
of a rectangle whose sides are 9.53 and 8.67 centi- 
meters long. One finds the largest and smallest val- 
ues the length and width could have and from these 
the largest and smallest values the area could have, 
for example: 


9.54 × 8.68 = 82.8072
9.53 × 8.67 = 82.6251
9.52 × 8.66 = 82.4432

From these values it is seen that the only value


degree of confidence can
figure. Since the tenths place was only partially cer- 


marked to 
figure. The above reasoning applies to all calcula- 
tions involving measurements and therefore has 
wide generality. 
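
The largest-and-smallest-value reasoning in the rectangle example can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, the error of .01 per side is the one assumed in the text, and the digit comparison assumes both bounds have the same number of digits before the decimal point.

```python
from decimal import Decimal


def product_bounds(length, width, error):
    """Return the smallest, nominal, and largest area when each side may be
    off by `error` in either direction."""
    length, width, error = Decimal(length), Decimal(width), Decimal(error)
    smallest = (length - error) * (width - error)
    nominal = length * width
    largest = (length + error) * (width + error)
    return smallest, nominal, largest


def shared_leading_digits(smallest, largest):
    """Digits on which both bounds agree, reading from the left."""
    shared = []
    for low_char, high_char in zip(str(smallest), str(largest)):
        if low_char != high_char:
            break
        shared.append(low_char)
    return "".join(shared).replace(".", "")


if __name__ == "__main__":
    low, mid, high = product_bounds("9.53", "8.67", "0.01")
    print(low, mid, high)
    print("digits common to both bounds:", shared_leading_digits(low, high))
```

Only the digits on which the smallest and largest areas agree are fully certain; the first digit after them is the partially certain one.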

All instruction was provided within the context of
multiplication problems. Following the basic in- 
struction on multiplication problems, Ss’ ability to 
transfer this basic reasoning to addition and trigo- 
nometry problems was measured. 

Version 2 of the program was exactly like Ver-
sion 1 except that immediately following the mate-
rial on the basic principle of significant figures, Ss 
were given the rule-of-thumb section which in- 
cluded the following frame plus several illustra- 
tions of its application: 


A Quicker Way For Determining Which Figures 
In A Calculation Are Significant. 


The method you have used is excellent, but a 
bit slow. Here is a faster way. It is just a rule of 
thumb, and works only for products and quo- 
tients, not for addition. 

Rule: When multiplying or dividing, the re- 
sult has just as many significant figures as the 
factor with the fewest significant figures. 


This version is referred to as the “rule-early”
program since the rule of thumb was given im-
mediately following instruction on the basic
principle, but prior to a short practice segment.

The third version of the program was exactly
like the second version except that the rule of
thumb was introduced at the very end of the
program following the short practice segment.
Version 3 is referred to as the “rule-late” version.
The practice segment was designed to provide
extra practice using the general principle prior to
the introduction to the rule of thumb. Thus, the
only difference between Versions 2 and 3 of the
program was the placement of the rule of thumb
in the instructional sequence, either before or
after a short practice segment. Table 1 summarizes
the instructional sequences in the three experi-
mental treatment conditions, and the associations
which the instructional segments had been de-
signed to teach.

The two rule-of-thumb versions of the pro-
gram also contained several explicit warnings
concerning the exceptions to the rule of thumb.
Upon introducing the rule of thumb, Ss were told:
“It is just a rule of thumb and works only for
products and quotients, not for addition.” After
introducing the rule, Ss were told:


... as with many rules, there are occasional
exceptions where the rule gives an incorrect
answer. Therefore you are strongly advised to
check any rule result by using the basic reason-
ing of significant figures until you get a feeling
for when the rule works and when it doesn’t—
say, at least for the next week or so.


Immediately preceding the transfer test Ss were
again told: “Even though you have not practiced
these, you can reason them out. Just trust your
brain.” In addition to these cues, the rule of
thumb itself contains a cue to the limits of its
applicability; thus, the rule states, “When mul-
tiplying or dividing, the result has just as many
significant figures as the factor with the fewest
significant figures.” (The italics were not included
in the experimental program.)
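
For a product, the rule of thumb quoted above can be written as a short Python sketch. The function names are hypothetical, and the way significant figures are counted here (every digit of the written number except leading zeros) is an assumption made for illustration; the sketch covers multiplication only, in keeping with the stated limits of the rule.

```python
from decimal import Context, Decimal


def significant_figures(value):
    """Count the significant figures in a number written as a string,
    e.g. "9.53" has three and "0.050" has two."""
    return len(Decimal(value).as_tuple().digits)


def multiply_by_rule(a, b):
    """Multiply two measurements and keep as many significant figures as
    the factor with the fewest significant figures."""
    keep = min(significant_figures(a), significant_figures(b))
    # A context with precision `keep` rounds the product to that many digits.
    return Context(prec=keep).multiply(Decimal(a), Decimal(b))


if __name__ == "__main__":
    print(multiply_by_rule("9.53", "8.67"))
```

Applied to sides of 9.53 and 8.67 centimeters, each with three significant figures, the rule keeps three figures of the product 82.6251.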


The pretest contained five problems designed to assess
Ss’ prior knowledge of significant figures. The
posttest contained five multiplication problems 
and eight 


problems, and three trigonometry problems. The 
transfer problems could not 
by the simple application of the 
but they could be 
principle of significant 
in Stage 1 


groups, respectively. The reliabilities of the three-
item trigonometry transfer test were .58, .53, and
groups, respec-
tively. Some of the reliabilities are lower than those
usually obtained for mathematics achievement
tests. Although the lower reliabilities were in part
due to the small number of items making up the
subscores, the range of performance was
severely restricted since many Ss achieved perfect
scores.


Procedures 


A pool of 162 Ss in an introductory educational
psychology class were given the significant figures
pretest. Twenty-five Ss were uninterested in par-
ticipating. Of the remaining Ss, only those whose
pretest performance indicated minimal knowledge
of significant figures (maximum scores of zero or
one on addition and never more than two on mul-
tiplication with zero on addition) were asked to
volunteer for the experiment. Seventy-nine stu-
dents served as experimental Ss. The Ss were
assigned at random to one of the three experi-
mental conditions. Approximately 1 week after
the administration of the pretest, Ss reported to
a large classroom to complete the instructional
program and the posttest. Two groups were tested
on two successive evenings. Upon arriving at the
experimental room, each S was given the version
of the instructional program (either no rule of

TABLE 1

EXPERIMENTAL PARADIGMS AND ASSOCIATIONS ESTABLISHED
DURING THE DIFFERENT EXPERIMENTAL STAGES

Stages of instruction     Condition 1—No rules      Condition 2—Rule early    Condition 3—Rule late

Stage 1—Pretest and       Pretest                   Pretest                   Pretest
basic instruction         Basic principle of        Basic principle of        Basic principle of
                          significant figures       significant figures       significant figures
                          in multiplication         in multiplication         in multiplication
Associations estab-       M → Pr                    M → Pr                    M → Pr
lished

Stage 2—Introduction      Practice                  Rule given for            Practice
to rule of thumb and                                multiplication
practice segment          M → Pr                    M → Ru                    M → Pr
                                                    Practice                  Rule given for
                                                                              multiplication
                                                    M → Pr                    M → Ru
                                                    M → Ru

Stage 3—Measure trans-    Multiplication posttest   Multiplication posttest   Multiplication posttest
fer and multiplication    Transfer posttest         Transfer posttest         Transfer posttest
posttest                    Addition                  Addition                  Addition
                            Trigonometry              Trigonometry              Trigonometry

Note.—Abbreviated: M = multiplication problems; Pr = general principle; Ru = rule of thumb.


thumb, rule early, or rule late) depending on the 
experimental group to which he had been as- 
signed. The seating arrangement in the room 
was staggered to prevent communication. Follow-
ing completion of the instructional program each
S was given the posttest containing the multiplica-
tion and transfer problems. After completing the
posttest, the Rokeach Dogmatism Scale was also
administered as part of a separate phase of the
investigation. Following completion of this scale, 
Ss were thanked for their participation and dis- 
missed. Approximately 2 weeks later, the nature 
or purpose of the experiment was explained to 


RESULTS


Results were analyzed by means of one- 
way classification analysis of variance and 
Scheffé tests of individual comparisons. As
a precaution against the possible
violations of assumptions underlying the
analysis of variance, a
one-way analysis of variance by ranks
(Siegel, 1956) was also employed.

Means and standard deviations for the
transfer tests under each condition are
shown in Table 2. Overall F tests were
statistically significant for both the addi-
tion and trigonometry transfer scores (F =
12, df = 2/76, p < .001 and F = 3.5, df =
2/76, p < .05, respectively). The mean of
4.5 achieved by the no-rule group on the 


addition problems, and the mean of 16 
achieved by the same group on the trig- 
onometry problems indicates that the in- 
structional program in the basic reasoning 
of significant figures was successful in pro- 
ducing considerable transfer to problems 
which were not specifically taught in the 
Program. Highteen of the 26 Ss in the no- 
Tule group achieved perfect scores of five 
On the addition transfer test. A Scheffé 
comparison of the mean addition transfer 
Scores of the two rule-of-thumb groups 
with the no-rule group indicates that the 
inclusion of the rule of thumb in the in- 
structional program produced a considera- 
ble decrement in transfer. The mean addi- 
tion transfer score of the no-rule group 
Was significantly greater than the means 
of the rule-early and rule-late groups (p 
values were less than .05 and .01, re- 
spectively). Whereas 18 of 26 Ss in the no- 
Tule group achieved perfect scores on the 
addition transfer test, only 20 of 53 Ss in 
the two rule-of-thumb groups achieved per- 
ct scores, 

The detrimental effect of the rule of thumb
appeared somewhat weaker in the case of the
trigonometry subtest. The Scheffé comparisons
indicated that the no-rule versus rule-late
comparison was the
only comparison which approached sta- 
tistical significance at less than the .10 
level, although the other comparisons were 
consistent in direction with the differ- 
ences obtained for the addition test. The 
reduced statistical significance in the anal- 
ysis of the trigonometry scores as com- 
pared to the addition scores may have been 
due to the smaller number of items and the 
resultant lower reliability of the trigonom- 
etry subtest. 
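
As a rough check on the overall F test for the addition scores, a one-way analysis of variance can be recomputed from group means, standard deviations, and sizes alone. The sketch below is illustrative only. The function name `anova_from_summary` is hypothetical, the values are those printed in Table 2 with the rule-early addition mean read as 3.5 (the test had only five problems), and the group sizes come from the table note, so the within-groups df comes out as 77 rather than the 76 reported in the text. Because the published means and SDs are rounded, the result can only approximate the reported F.

```python
def anova_from_summary(means, sds, ns):
    """One-way ANOVA F ratio computed from per-group summary statistics.

    means, sds, ns: sequences with one entry per group (SDs are sample SDs).
    Returns (F, df_between, df_within).
    """
    k = len(means)
    n_total = sum(ns)
    # Grand mean is the size-weighted average of the group means.
    grand_mean = sum(m * n for m, n in zip(means, ns)) / n_total
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, ns))
    # Within-groups sum of squares: pooled from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Addition transfer test, values as read from Table 2 (assumption: 3.5 for rule-early).
f_ratio, df_b, df_w = anova_from_summary(
    means=[4.5, 3.5, 2.6], sds=[0.9, 1.6, 1.7], ns=[26, 27, 27]
)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The same call with the trigonometry row of Table 2 gives the corresponding approximation for the second transfer test.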

The mean scores on the multiplication 
posttest (less one item which could not be 
solved by the rule-of-thumb method) were 
2.9, 3.0, and 2.9 for the no-rule, rule-early, 
and rule-late groups, respectively. Thus, 
the differences in transfer of the three 
groups could not be ascribed to differences 
in achievement on the multiplication 
problems directly taught in the program. 

The second question of interest in the 
study concerned the effects of the rule- 
practice sequence on transfer. It was hy- 
pothesized that an experimental condition 
in which a rule was given before a practice 
segment would have a more detrimental ef- 
fect on transfer than a condition in which 
the rule of thumb was given after practice. 
It was expected that giving the rule of 
thumb late in the program would provide 
additional practice with the more general 
principle, thus reducing the amount of in- 
terference from the rule. This hypothesis 
was not confirmed. In fact, the only differ- 
ence between the rule-early and rule-late 
groups which approached significance at 
less than the .10 level was in the direction 
opposite to that which was predicted. This 
unexpected finding may have resulted from 
the relatively short length of the practice 
segment (due to a limit on Ss’ time, only
three problems were employed in the 
practice segment) or from a recency effect 
of giving the rule of thumb just prior to 
the transfer test which may have increased 
its saliency during the test situation. 

TABLE 2
DESCRIPTIVE STATISTICS FOR THE THREE EXPERIMENTAL GROUPS ON THE TWO TRANSFER TESTS

Transfer test | Condition 1 no-rule | Condition 2 rule-early | Condition 3 rule-late
 | M | SD | M | SD | M | SD
Addition (5 problems) | 4.5 | .9 | 3.5 | 1.6 | 2.6 | 1.7
Trigonometry (3 problems) | 1.6 | .7 | 1.2 | 1.0 | 1.0 | 1.0

Note.—In Condition 1, N = 26; in Condition 2, N = 27; in Condition 3, N = 27.


Since a large number of Ss in the experiment
achieved perfect scores on the transfer tests,
the distributions of performance were generally
negatively skewed. As a precaution against the
possible effects of violations of the assumptions
of normality of distributions, Kruskal-Wallis
one-way analyses of variance by ranks were also
computed (Siegel, 1956). The results of this
analysis were consistent with the results of the
parametric analysis of variance (for addition
χ² = 283, p < .001, df = 2; for trigonometry
χ² = 292, p < .001, df = 2).
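
The Kruskal-Wallis procedure replaces raw scores with ranks, which is why it is less sensitive to skewed distributions such as those produced by many perfect scores. A minimal sketch of the standard tie-corrected H statistic follows. The function names and the example scores are hypothetical and are not the study's data.

```python
from collections import Counter


def midranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups, corrected for ties."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = midranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else 0.0


# Hypothetical 5-item test scores for three small groups (not the study's data).
no_rule = [5, 5, 4, 5, 3]
rule_early = [4, 3, 5, 2, 3]
rule_late = [2, 3, 1, 4, 2]
print(kruskal_wallis_h([no_rule, rule_early, rule_late]))
```

With three groups, H is referred to the chi-square distribution with 2 degrees of freedom, matching the df = 2 reported in the text.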


Discussion 


Although the present findings should be 
replicated and extended to other subject 
matters and other rules, the results suggest
that teaching a rule of thumb of
limited generality in an instructional pro- 
gram designed to facilitate transfer by 
means of a more general principle may 
produce considerable interference. Since 
many subject matters include a number of 
rules and principles varying in generality, 
teaching methods may have to provide op- 
portunities to reduce the amount of inter- 
ference in such situations. 

Several tentative explanations may be 
offered to account for the results obtained. 
Schulz (1960) and Gagné (1964) have sug- 
gested that the same principles which ac- 
count for simpler forms of associative 
learning may also occur in more complex 
forms such as principle learning or problem 
solving. Following this line of argument, it 
is interesting to ask whether the transfer 
paradigm employed in the present experi- 
ment in any way resembles the classical 
negative transfer paradigms employed in 
experiments on verbal learning. The paradigms
of the present investigation are outlined
in Table 1.

One assumes that the didactic verbal warnings
which were provided to warn Ss about
the exceptions to the rule were ineffective


able to determine the appropriateness of
the solution during the
test.


for the
multiplication, addition, and trigonometry


these three classes of
outward appearances
and the problems to which the rule is
inapplicable.

It is interesting that the present paradigm,
although not entirely parallel, bears some
resemblance to a retroactive inhibition
paradigm in which two competing responses
are associated with the same or similar
stimuli. One of the authors is conducting
follow-up research to determine
whether the detrimental effect of the rule 
of thumb is simply the result of over- 
generalization of its use, or whether other 
factors are involved. A tentative analysis 
of Ss’ errors on the transfer tests suggests 
that at least part of the effect is attribut- 
able to overgeneralizing the rule of thumb 
or rule misuse. 

The effect of the rule of thumb appeared
similar to the effect of a strong persistent
set which seems to “blind” Ss to alternative
solutions as reported in several classical
experiments on set by Rees and Israel
(1935) and Luchins (1942). This apparently
strong set to use the rule of thumb
rather than alternative solution methods
may result in part from its simplicity, ease
of recall, and ease of application. It is also
possible that the typical college student
has been strongly preconditioned to use
such rules through years of previous
instruction which has emphasized
rule-of-thumb solutions.

The present results also resemble the results
of Wertheimer’s (1959) work on
teaching children to find the area of a
parallelogram, and Katona’s (1940)
match-stick problems. Both of these
investigators found that a condition in which
Ss were taught a general principle facilitated
transfer when compared to a condition
in which a less general principle was
taught or where Ss simply practiced problems.
In Wertheimer’s study, children who
were taught to find the area of a parallelogram
by a specific solution could not
transfer to problems in which the parallelogram
was placed in a different position, or

to different geometric figures. Furthermore,
when the children were presented with the
new problems, they would often attempt to
blindly apply the old inadequate method.

However, there is one important difference
in procedure between the Wertheimer
and Katona experiments and the present 
experiment. In these earlier experiments, 
Ss were taught either by the method 
utilizing the general principle or by the 
specific solution method. It is not surpris- 
ing that the methods produced differences 
in transfer since one group was never 
taught the relevant general principle. In
the present experiment, Ss were taught
both a method based on a general principle
and a method based on the rule of thumb.
The fact that the rule of thumb interfered
with transfer even when an appropriate
alternative solution was available to Ss
emphasizes even more strongly the dangers
of negative transfer which exist when
such rules are available.

It might be argued that the verbal 
warnings which were given to Ss were 
simply not strong enough to call their at- 
tention to the limitations of the rule. This 
possibility must be recognized. The writers 
would agree that other warnings might 
have been included which might have had 
the desired effect on Ss’ behavior. The 
warnings used in the present study were 
selected a priori by the writers, after con- 
siderable debate concerning their relative 
strength. Unfortunately, it was not possible 
in the present study to determine pre- 
cisely the effect of the verbal warnings, 
since a rule-of-thumb group without the 
warnings was not included. However, it 
seems likely that didactic statements 
alone may not be very effective in establishing
the discriminations needed to apply
different principles to different problems.

Rules and principles may be a great aid
in problem solving; however, the present 
study suggests that teaching rules with 
only limited generality may actually in- 
terfere with transfer. If these findings are 
replicated in other problem solving situa- 
tions, one would either recommend that 
rules of limited generality not be taught 
where one has the alternative of teaching 
a more general principle, or that the instruction
include opportunities for the student
to learn when to use each of the alternative
methods. It is possible that the
additional instructional time required to
teach the student to recognize the excep- 
tions to a rule of thumb will offset the in- 
creased problem-solving efficiency in the 
limited class of problems to which the rule 
of thumb applies.


EFFECTS OF EFFORT ON RETENTION AND ENJOYMENT¹


RUBY YARYAN BUENZ AND IRVING R. MERRILL
University of California, San Francisco 


Dissonance theory predictions concerning the effects of effort on re- 
tention and enjoyment were tested in a human learning experiment. 


60 Ss 


For years the variable of effort has had
a confused status in learning theory
(Lewis, 1965). The concept of effort has
referred broadly to the magnitude of
energy expended in responding to a learning
situation. It emphasizes the sheer
amount of energy expended in the entire
situation surrounding the learning rather
than the specific repetition of a particular
S-R sequence to be learned. In this
difference involving the magnitude and
direction of activity, effort can be
distinguished from the other learning variables


extinction,
raise serious questions for traditional views,
since most learning theories fail to predict
the effect of effort on resistance to
extinction (Lawrence & Festinger, 1962).
As a solution to this theoretical im- 
passe, the role of effort during learning 
and extinction has been explained in the 
context of dissonance theory (Festinger,
1961; Lawrence & Festinger, 1962). According
to dissonance theory, the greater
the effort that an organism is required
to exert during the training situation, the
more the organism resists extinction of
the learned behavior. Effort is considered
to be a deterrent to action, creating dissonance
in the organism as it is induced
to engage in effortful activity. Dissonance
is reduced in the organism by developing
“extra attractions” for the activity surrounding
the learning or its consequences.
Experiments using rats have demonstrated
that increased effort leads to greater
resistance to extinction regardless of the
reward schedule employed during training.
In these studies the development of
“extra attractions” has been inferred to 
explain the phenomenon of greater re- 
sistance to extinction (Lawrence & Fes- 
tinger, 1962). 

The present experiment was conducted 
to test these dissonance theory predictions 
in a human learning situation. The specific 
experimental objectives were (a) to see if
the dissonance theory predictions would
hold for long-term retention in humans,
and (b) to get some direct measure of
positive affect associated with effort that
would lend support to the postulation of
the development of “extra attractions.” In
the application of dissonance theory three
assumptions were made. First, the learning
processes involved in trials to extinction
in animals are analogous to measures of
retention in humans, for both measures
deal with how long in time a given learned
behavior is maintained. This assumption
seemed reasonable not only because of the 
temporal similarities of the two measures, 
but also because the present experiment 
was designed to simulate in a human 
setting the acquisition procedures used in 
extinction experiments with animals. Spe- 
cifically, the material to be learned was 
presented in small segments, each segment 
requiring an effortful response. Inducing 
the participants to engage in repeated 
dissonant experiences during acquisition 
trials constituted the experimental manipulation.
The second assumption was that
human learning of abstract concepts in- 
volves principles that parallel (in a broad 
sense) the acquisition of simple perceptual- 
motor discriminations in animals. Third, 
it was assumed that the development of 
“extra attractions” is experienced by hu- 
mans as enjoyment. 

The hypotheses tested in the present
experiment are as follows: 

1. When people are required to exert a 
great deal of effort while they are learn- 
ing, they will remember the learned ma- 
terial longer than people required to engage 
in little effort during the same learning 
situation. 

2. Those people who exert a great deal 
of effort while they are learning will enjoy 
the learning conditions more than people 
who are required to exert little effort 
while learning the same material. 


Method


A 2 × 2 design was used in the experiment. All
subjects (Ss) were shown identical films. Interspersed
throughout the films at periodic intervals
were written questions requiring a one-word answer.
Two groups of Ss, each tested initially at a
different time, were required to write the one-word
answer to each question plus a short essay justifying
each answer; these Ss comprised the High-Effort
Groups. Two Low-Effort Groups of Ss, each
tested initially at a different time, viewed identical
materials, and were required to write only the
one-word answer to each question. After completion of
the experiment all Ss were given a test of initial
learning, and were asked several attitude questions.
Several weeks later all the Ss were gathered together
and given a retention test. The major dependent
variables were (a) how much Ss remembered
from the initial to the retention tests, and
(b) how much Ss liked the experiment after they
had completed the initial learning task. The details
of the procedure follow.


Subjects 


The Ss were 66 female students enrolled in a 
course in nursing. They were randomly assigned in 
equal numbers to the High- and Low-Effort 
Groups. Six Ss did not complete the experiment; 
the analysis was based upon 31 Ss in the High- 
Effort and 29 Ss in the Low-Effort Groups. Within 
each group, 17 Ss were retested after a 59-day test 
interval, and the remaining Ss were tested after a 
40-day test interval. 


Materials 


The stimulus materials were two identical 56- 
minute films (television kinescope recordings) en- 
titled Recent Advances in Nursing Care of the 
Aged: Care of the Cerebral Vascular Accident Pa- 
tient. Each film included 20 sentence-completion- 
type questions, which appeared at about 3-minute 
intervals during the presentation. 

Printed answer sheets were prepared to accom- 
pany the film. All sheets repeated each question 
in full and provided space for the appropriate writ- 
ten answer. For the Low-Effort Groups the ques- 
tions were closely spaced; for the High-Effort 
Groups each page of the form contained only two 
questions, allowing space for a short essay after 
each question. 

Most of the film questions involved factual answers
that were difficult to dispute in an essay of
justification. A typical question is exemplified by
the following: CVA is often caused by thrombosis,
hemorrhage, and ______ (answer: embolus). It
should be noted that special attention was paid to
the selection of questions for the essays. It was
important to have questions in which effort could
be manipulated without varying direct practice.
Questions of a factual, indisputable nature were
chosen for the essays of justification, because (a)
the indicated answers were extremely difficult to
enlarge upon and required considerable effort, and
(b) the resulting essays, by necessity, contained
information that was quite tangential to the critical
material to be learned. For example, taking the
question above, S was typically forced to base her
justification on the fact that she was instructed
in the film as to the answer and that is why she
answered it as she did. Such a justification does
not involve a repetition (practice) of the critical
S-R connection to be learned. Evidence regarding
the effectiveness of the manipulation will be
presented below.


Procedure 


Each group viewed identical films in separate but
comparable rooms. Initial learning was tested
immediately after the viewing sessions. The retest
measuring the retention of all Ss was conducted at
one time with the groups combined.

When Ss assembled for the initial learning
session they were informed that the purpose of the
study was to evaluate their attitudes and reactions
to the film, which was to be revised for broadcast
television use. This deception was used to minimize
suspicion that Ss would be retested later.

Before the viewing, every S was given an answer 
sheet and a printed set of instructions. For the 
Low-Effort Group the instructions were to write 
the missing word on the answer sheet during the 
15 seconds provided in the film after every ques- 
tion. Instructions to the High-Effort Group were 
to write the missing word and a short essay of 
justification for each answer; the presentation was 
interrupted after every question for about 2 min- 
utes, so that the essay could be completed before 
the presentation resumed.

After the viewing, every S was handed a 40-item
Initial Test to complete. They were told that test
performances would be held in confidence by the 
experimenters and that course grades would not 
be affected. The High-Effort Groups took approxi- 
mately 130 minutes and the Low-Effort Groups 
took approximately 95 minutes to complete this 
phase of the experiment. 

At the retest session Ss were told they were to 
complete a second evaluation of the film. They 
were given a 20-item Retention Test with the 
same instructions used for the previous 40-item 
test. When all Ss completed the Retention Test,
the true nature of the experiment was explained
in detail.


Measurement


The 40-item Initial Test consisted of a random


questions that were related
to the film questions, 3 attitude


content of the film. Each was
question with five alternative

perception of effort (with score values in 
parentheses): “How did you find this en- 
tire experiment now that you have com- 
pleted it? Very effortful (5), Moderately 
effortful (4), Somewhat effortful (3), 
Slightly effortful (2), Not effortful at all 
(1).” 

In the analysis of variance on perceived
effort, only Effort Groups proved to be
significant. The means for the High- and
Low-Effort Groups were 3.00 and 2.24,
respectively, indicating that Ss in the
High-Effort Groups perceived that they
had exerted significantly more effort during
the experiment than Ss in the Low-Effort
Groups (F = 5.40, df = 1/55, p < .05).

The Ss were also asked a question about 
their perception of work while they were 
viewing the films. When the two questions 
were combined into a “work-effort” index,
parallel findings were obtained with a
between-groups F ratio of 4.88 (p < .05).
Correlations, however, between scores on
the two questions were only r = .38 and
.42 for the High- and Low-Effort Groups,
respectively. Examination of the questions
indicated that the wording of the question
on work was ambiguous. Therefore,
further analyses of perceived effort are
reported using only the question on effort as
an indicator.

On the sentence completion task (per- 
formed during the presentation of the 
films), the groups were very comparable 
with the mean number of correct responses 
for the Low-Effort Groups (16.14) slightly 
exceeding the mean for the High-Effort 
Groups (15.97).

The reliability coefficient for the in- 
ternal consistency of the Initial Learning 
Test was r = .48, which was considerably
lower than that for hospital staff nurses. 

On the Initial Learning Test, as in the 
sentence completion task, the Low-Effort
Groups scored slightly higher than the
High-Effort Groups (Table 1). Analysis of
variance on Initial Learning scores indicated
that the Groups did not differ significantly
in this respect (F = .29). The
importance of these trends will be discussed 
in greater detail below. 

Four control questions on (a) how well
Ss could see the films, (b) how well Ss
could hear the films, (c) how helpful the
questions were to Ss, and (d) how distracting
the questions were to Ss indicated that
the groups were very comparable. These 
questions were included to see if any spuri- 
ous situational factors could account for 
any of the learning effects. 

Differences in retention related to effort 
expended and test interval used were 
analyzed by a two-way analysis of vari- 
ance for unequal frequencies in subclasses 
(Walker & Lev, 1953). The means for 
the Retention Test scores and the Retention
change scores are presented in Table 1.
The analysis of variance for Retention
change scores showed no significant source
of variance due to Effort Groups (F =
2.42), Test Intervals (F = .31), or the
interaction between them (F = .38).
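
The Retention change score is each S's Initial Test score minus her Retention Test score, averaged within a cell of the design, so that smaller means indicate better retention. The sketch below illustrates that computation only. The function name and the scores are hypothetical and are not the study's data.

```python
def mean_retention_change(initial_scores, retention_scores):
    """Mean of per-subject (initial - retention) differences for one group.

    Both lists must hold scores for the same subjects in the same order.
    A smaller result means less forgetting, i.e., greater retention.
    """
    if len(initial_scores) != len(retention_scores):
        raise ValueError("Each subject needs both an initial and a retention score.")
    if not initial_scores:
        raise ValueError("At least one subject is required.")
    changes = [i - r for i, r in zip(initial_scores, retention_scores)]
    return sum(changes) / len(changes)


# Hypothetical scores for four subjects in one cell of the 2 x 2 design.
initial = [17, 16, 18, 15]
retention = [16, 15, 18, 13]
print(mean_retention_change(initial, retention))
```

Because every S contributes both scores, the mean change for a group equals the difference between its two test means, apart from rounding.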

To examine if effort was related to 
retention, correlation coefficients were com- 
puted between the retention change scores 
and the perception of effort ratings. Be- 
cause the Test Interval and interaction 
effects were so small, the 59- and 40-day 
intervals were combined in each of the
effort groupings for this analysis. For the
High-Effort Groups the correlation between
perceived effort and retention was
.28 (p < .05), whereas for the Low-Effort
Groups perceived effort and retention were
negatively related with a correlation of
-.11 (ns). Consistent with the hypothesis,
perceived effort was related to retention for
the High-Effort Groups only.

Considering the analysis of variance 
and the correlations between retention and 
effort, the results provide positive but in- 
conclusive evidence regarding the first 
hypothesis that people who exert a great
deal of effort while learning remember the
material better than people who engage
in little effort during the same learning
situation.

In order to test directly the second
hypothesis, evaluating the effects of effort
on enjoyment, all Ss were asked the following
question on perceived enjoyment (with
score values in parentheses): “Now that
you have completed the experiment, what
did you think of it? Enjoyed it very much
(5), Enjoyed it moderately (4), Enjoyed
it somewhat (3), Enjoyed it slightly (2),
Did not enjoy it at all (1).”


TABLE 1
MEANS FOR INITIAL LEARNING TEST SCORES, RETENTION TEST SCORES, AND RETENTION CHANGE SCORES

Groups | N | Initial test scoresᵃ | Retention test scoresᵃ | Retention change scoresᵇ
High effort
  interval | | 16.65 | 15.82 | .82
  (Groups total) | | | |
Low effort
  59-day interval | | | 15.35 | 1.21
  40-day interval | | 16.00 | 14.67 | 1.33
  (Groups total) | | (16.36) | (15.07) |

ᵃ Based on total correct score for each subject.
ᵇ Based on difference between Initial and Retention Test scores for each subject. Smaller means indicate less change or greater retention.

Results of the analysis of variance for 
Perceived Enjoyment showed only Effort 
Groups to be significant (F = 4.29, df
= 1/54, p < .05). The means for the
High- and Low-Effort Groups were 2.19 
and 1.70, respectively, indicating that the 
High-Effort Groups enjoyed the learning 
situation significantly more than the 
Low-Effort Groups. 

To assess the relationship between effort 
and enjoyment, correlation coefficients 
were computed for the effort groupings. For 
the High-Effort Groups the relationship 
between the perception of effort and enjoyment
was significant with an r of .27
(p = .05). For the Low-Effort Groups, 
however, enjoyment was not related to 
perceived effort (r = -.08, ns).

Following the prediction, perceived effort 
was related to enjoyment for the High- 
Effort Groups only. The results confirm the 
hypothesis that people who exert a great 
deal of effort while they are learning will 
enjoy the learning situation more than 
people who are required to exert little 
effort while learning the same material. 


Discussion 


The findings on enjoyment provide
direct support for the hypothesis that
dissonance, created by effort, is reduced in
the organism by developing “extra attractions”
for the learning conditions. This
experiment differs in two ways from an
experiment by Aronson (1961), who stud- 
ied the effect of effort on attractiveness. 
First, in Aronson’s study, effort was con- 
founded with reward, and the results 
were interpreted in terms of the interac- 
tion between dissonance and secondary re- 
inforcement (a clear, unambiguous dis- 
sonance effect was not demonstrated). 
In this study reinforcement was held 
constant and enjoyment followed the dis- 
sonance-theory prediction. Second, Aron- 
son’s study concerned the expenditure of 
effort in obtaining colored containers, a 
simple, repetitive motor task. The present 
study was concerned with the expenditure 
of effort during the learning of unfamiliar 
material, demonstrating the effect of 
dissonance on enjoyment as it applies to 
an area of human learning. 

The results on Initial Learning indicate 
that the manipulation was effective in 
producing effort that was irrelevant to the 
specific material to be learned. If the 
manipulation of effort had involved rele- 
vant practice, there should have been 
discernible improvement in initial-learning
scores of the High-Effort Groups, but
this did not occur. The initial-learning 
mean for the Low-Effort Groups was 
slightly greater than the mean for the 
High-Effort Groups. The same trend of 
the Low-Effort Groups scoring higher than 
the High-Effort Groups occurred with the
mean scores on the sentence-completion
task. Had relevant practice been involved,
opposite trends would have been expected
to occur.

The results on retention, although in
the predicted direction, are
equivocal. It appears that the comparatively
low reliability of the test measuring
the content learned from the films very
likely increased the error variance and
reduced the significance (see Walker &
Lev, 1953, p. 306). If improved test reliability
would reduce the error term
slightly, the first hypothesis that greater 
effort produces greater retention could be 
accepted. 

The major implications of the experi- 
ment for educational practice would center 
on the activity or effortfulness of a 
student’s response to educational materials 
(for example, materials including some of 
the elements of programmed instruction, 
especially those aspects dealing with the 
presentation of information in small segments
followed by an effortful response).
According to dissonance theory the more 
effort that the student is required to ex- 
pend in the learning situation, the greater 
should be his enjoyment of the learning 
conditions and the greater should be his 
retention of the learned material. 

In summary, the retention results can 
be considered suggestive, but cannot be 
considered a completely unequivocal test 
of the first hypothesis. The hypothesis con- 
cerning the effect of dissonance on enjoy- 
ment as it occurs in an effortful learning 
situation, however, was clearly confirmed 
in the experiment.


TEST ANXIETY AND FEEDBACK IN
PROGRAMMED INSTRUCTION¹


PEGGIE L. CAMPEAU
Los Gatos, California²


The study evaluated effects upon criterion performance of S’s test-
and the presence or absence of feedback in programmed
and 44 girls in the 5th grade scoring at high-anxious
(HA) and low-anxious (LA) ends of the distribution on the Test
Anxiety Scale for Children (TASC) were
and no-feedback (NO-FB) versions of a programmed lesson, yielding
a 2 × 2 factorial
tion was obtained
the FB condition (p < .05)
NO-FB condition (p < .05). No significant effects
boys. The results suggest

differences in test-
effects of anxiety

Investigators, including Mandler & Sarason
(1952), Nicholson (1958), and a Yale
team led by S. Sarason (Sarason, Davidson,
Lighthall, Waite, & Ruebush, 1960)
evaluated the effects of testlike conditions
on the performance of high-anxious (HA)
and low-anxious (LA) subjects (Ss) on
conceptual learning tasks. The most consistent
finding was that under the stress of
testlike conditions, the performance of HA
Ss was disrupted to a significantly greater
degree than that of LA Ss. On the other
hand, the direction of this difference was
reversed when Ss were able to see whether
or not they were improving from trial to
trial (Mandler & Sarason, 1952), or when
they received reassurance instructions in
the nature of feedback (Sarason, 1957,
1958).

Reports of many investigators indicate
negative correlations between measures of
anxiety and scores on intelligence tests and
college aptitude batteries (Feldhusen &
Klausmeir, 1962; Hafner & Kaplan, 1959;
McCandless & Castaneda, 1956; Phillips,
King, & McGuire, 1959; Sarason, 1961a;
Sarason & Mandler, 1952; Sarason et al.,
1960). In a comprehensive review of findings
related to the negative anxiety-ability
relationship, Sarason et al. (1960) noted
that even in studies where intelligence was
controlled, there still were differences in
learning rate between HA and LA groups.

Education, Peggie L. Campeau, Principal Investigator.
The experimental materials were developed
by the investigator under a separate project
(Contract No. OE 3-16-036) conducted by the Institutes
for the United States Office of Education.

Most of the anxiety research cited above
employed an anxiety model which assumes
that the elicitation and effects of anxiety
depend on threat characteristics of the
task situation and on S’s anxiety level.
The model stipulates that when anxiety 
has been learned as a response to situa- 
tions involving intellectual achievement, 
anxiety feelings produced by comparable 
situations will elicit two types of re- 
sponses: responses which are relevant to 
the task and lead to task completion, and 
responses which are task-irrelevant and interfere with task
completion. One implication of the model is that in HA Ss,
threat produces more task-irrelevant responses than
task-relevant responses, and
thereby disrupts performance. On the 
other hand, removing threat from the situa- 
tion in which HA Ss must perform reduces 
task-irrelevant responses by reducing anx- 
iety and thereby facilitates performance. 
The suggestion for the present study was 
that different levels of anxiety may de- 
termine how insecure the learning situation 
can become without disrupting performance. Feedback was expected
to be a variable in programmed instruction style very likely to
interact with anxiety level on the grounds that if feedback were
omitted, testlike aspects of the program would be accentuated.
Feedback (FB) and no-feedback (NO-FB) conditions were expected
to have quite different effects on HA Ss as compared to LA Ss.
Hypotheses and related theoretical considerations derived
from the anxiety model and the research 
cited were as follows: 

Hypothesis I. HA Ss would perform significantly better on a
criterion test over the programmed lesson if they learned under
the FB condition, as compared to the NO-FB condition. Feedback was
expected to produce superior achievement for HA Ss based on two
assumptions: (a) providing feedback would constitute a low-threat
condition which would minimize task-irrelevant responses to anxiety
by reducing that anxiety, and (b) task-relevant responses would be
reinforced by providing feedback as confirmation.

Hypothesis II. HA Ss would perform significantly better than LA Ss
on the criterion test, when both groups learned under the FB
condition. Consistent with anxiety research already cited, providing
answers to program frames in the FB condition was expected to reduce
testlike aspects of the task so that criterion performance would be
enhanced more for HA Ss than for LA Ss.

Hypothesis III. When both groups learned under the NO-FB condition,
LA Ss would [...] criterion [...]ing their performance. On [...] LA Ss
by definition are less likely to perceive testlike situations as
threatening, so that their performance was not expected to be
disrupted.

Hypothesis IV. LA Ss in the FB condition would not differ
significantly from LA Ss in the NO-FB condition. The [...] finding for
LA treatment groups [...] anxiety to be manipulated, LA Ss would be
less susceptible both to the facilitating effects of providing
feedback and to the disrupting effects of withholding it.


METHOD


Subjects 


The Ss were fifth graders at two elementary 
schools in the San Francisco bay area who scored 
at the high and low ends of the Test Anxiety 
Scale for Children (TASC). 


Materials 


A programmed instruction lesson on earth-sun relationships was
prepared in two versions: one with feedback, one without. In all
other respects, the two versions of the 193-frame program were
identical. A pretest and posttest were prepared by random selection
from a pool of 116 test items covering the content of the programmed
lesson. The pretest contained 25 items; the posttest, 45 items.


Procedure and Design 


The TASC, developed by Sarason et al. (1960), was administered to all
fifth grade students at the two schools. Cut-off points for HA and LA
groups corresponded approximately to the upper and lower 27% of the
TASC distribution for each sex, yielding samples of 36 boys and 44
girls for experimental analysis. For HA girls, TASC scores ranged from
29 to 17; for LA girls, from 8 to 0. For HA boys, TASC scores ranged
from 26 to 16; for LA boys, from 5 to 0. Samples were identified
separately for girls and boys because previous research indicated sex
to be a significant variable in experiments dealing with anxiety
(Lunneborg, 1964; McCoy, 1963; Phillips, 1962; Sarason, 1961b, 1963;
Sarason et al., 1960).
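
The cut-off rule described above can be sketched in a few lines. This is only an illustration: the scores below are hypothetical, the function name is invented for the example, and the handling of tied scores at the cut-off (ignored here) is an assumption, since the text says only that the groups corresponded "approximately" to the upper and lower 27%.

```python
def split_by_anxiety(scores, fraction=0.27):
    """Return (high, low) lists of (student_id, score) pairs.

    high = top `fraction` of the distribution, low = bottom `fraction`.
    Ties at the cut-off are not treated specially (an assumption).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    k = round(len(ranked) * fraction)  # size of each extreme group
    high = ranked[:k]
    low = ranked[-k:] if k else []
    return high, low


# Hypothetical TASC scores for 11 students of one sex (not study data).
tasc = {"s01": 29, "s02": 22, "s03": 17, "s04": 15, "s05": 13, "s06": 12,
        "s07": 10, "s08": 9, "s09": 8, "s10": 4, "s11": 0}

high_anxious, low_anxious = split_by_anxiety(tasc)
print("HA:", high_anxious)
print("LA:", low_anxious)
```

Running the split separately for each sex, as the study did, would simply mean calling the function once on the girls' scores and once on the boys'.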

Nine days before the learning program was administered, scores on
the pretest were obtained to provide measures of prior knowledge of
the topic.

HA and LA Ss were assigned randomly to FB and NO-FB versions of the
program in a 2 × 2 factorial design for each sex. To avoid giving the
impression that certain students were being singled out for the
experiment, even those individuals falling between the cut-off points
in the TASC distribution were given one or the other version of the
program to complete. The Ss progressed through the program at their
own rates. Although mean time to complete the program was about 3
hours, 1½ school days were set aside for the experiment to allow Ss to
take normal breaks for recess, lunch, and physical education, just as
when working on class assignments.

The posttest was given to each S as soon as he 
completed his program. All Ss had time to com- 
plete the test. 

The same test was readministered 19 days later to obtain a measure of
delayed retention. Again, all Ss were allowed to complete the test.

To assure that instructions and procedures for 
all experimental measures and materials were 
equivalent at the two schools, (a) for every ex- 
perimental session, Ss assembled in a room large 
enough to accommodate the school’s entire fifth- 
grade enrollment with adequate working space; 
(b) all experimental sessions were well monitored 
(approximately one monitor for every 12 Ss); (c) 
the investigator always served as experimenter, 
assisted by the same monitors in all experimental 
sessions (teachers were never present), and (d) 
the same instructions were supplied to all Ss either 
in written form or by the experimenter through 
announcements from a script. 


RESULTS 


Immediate Retention 


A gain score (immediate retention test 
score minus pretest score) was calculated 
for each S. Because of the negative rela- 
tionship between anxiety level and ability 
demonstrated in studies cited earlier, 
these gain scores were adjusted by covari- 
ance to control for differences in IQ as 
measured by the California Test of Mental 
Maturity (CTMM). Adjusted mean gain 
scores for Ss in the different experimental 
treatments are given in Table 1. Scores for 
boys and for girls are shown separately. 
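
As a rough sketch of what adjusting gain scores by covariance involves, each group's mean gain can be shifted by the pooled within-group regression slope times the distance of that group's mean IQ from the grand mean IQ. The data and names below are hypothetical, and the study's actual computations are not given in the text; this only illustrates the standard adjusted-means formula.

```python
from statistics import mean


def adjusted_means(groups):
    """groups: dict mapping group name -> list of (covariate, outcome) pairs.

    Returns outcome means adjusted to the grand covariate mean, using the
    pooled within-group slope of outcome on covariate.
    """
    sxy = sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression coefficient
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    return {
        name: mean(y for _, y in pairs)
        - slope * (mean(x for x, _ in pairs) - grand_x)
        for name, pairs in groups.items()
    }


# Hypothetical (IQ, gain score) pairs; not data from the study.
example = {
    "A": [(100, 20), (110, 24), (120, 28)],
    "B": [(90, 14), (100, 18), (110, 22)],
}
adjusted = adjusted_means(example)
print(adjusted)
```

In this made-up case the raw means differ by 6 points, but part of that gap is attributable to the 10-point IQ difference between the groups, so the adjusted means differ by only 2.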

For girls, analysis of covariance for adjusted gain scores yielded a
significant Anxiety × Feedback interaction (F = 5.61, df = 1/39,
p < .025). No significant effects were found for boys. Differences
among adjusted means for girls’ data were in the hypothesized
directions. The significance of these differences was assessed

TABLE 1
ADJUSTED MEAN GAIN SCORES ON IMMEDIATE RETENTION TEST

                        Program version
Subjects             Feedback    No feedback

Girls
  High anxiety        25.85        16.47
  Low anxiety         19.42        23.53
Boys
  High anxiety        17.14        18.12
  Low anxiety         18.02        19.83

Note. For girls, N = 11 in each cell; for boys, N = 9 in each cell.


TABLE 2
ADJUSTED MEAN GAIN SCORES ON DELAYED RETENTION TEST

                        Program version
Subjects             Feedback    No feedback

Girls
  High anxiety        23.09        14.85
  Low anxiety         19.17        23.69
Boys
  High anxiety        21.72        20.32
  Low anxiety         17.27        20.27

Note. One or two Ss were lost from some cells due to absences from the
delayed retention test session. For girls, High-anxiety feedback,
N = 10, High-anxiety no-feedback, N = 11, Low-anxiety feedback, N = 9,
Low-anxiety no-feedback, N = 9; for boys, High-anxiety feedback, N = 8,
High-anxiety no-feedback, N = 9, Low-anxiety feedback, N = 7,
Low-anxiety no-feedback, N = 9.


by the critical difference method (Lind- 
quist, 1953). For HA girls, the FB condi- 
tion yielded significantly better perform- 
ance than the NO-FB condition (p < .05). 
No other differences were large enough to 
achieve significance. 
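
The pattern behind the interaction can be seen by differencing the adjusted cell means reported in Table 1. The short sketch below computes the feedback effect (FB minus NO-FB) at each anxiety level and the difference between those effects; it describes the means only and is not a significance test. The function and variable names are illustrative.

```python
# Adjusted mean gain scores from Table 1, keyed by (anxiety level, program version).
girls = {("HA", "FB"): 25.85, ("HA", "NO-FB"): 16.47,
         ("LA", "FB"): 19.42, ("LA", "NO-FB"): 23.53}
boys = {("HA", "FB"): 17.14, ("HA", "NO-FB"): 18.12,
        ("LA", "FB"): 18.02, ("LA", "NO-FB"): 19.83}


def feedback_effect(cells, anxiety):
    """Mean gain under feedback minus mean gain without feedback."""
    return cells[(anxiety, "FB")] - cells[(anxiety, "NO-FB")]


def interaction_contrast(cells):
    """Difference between the feedback effects for HA and LA Ss."""
    return feedback_effect(cells, "HA") - feedback_effect(cells, "LA")


for label, cells in (("girls", girls), ("boys", boys)):
    print(label,
          round(feedback_effect(cells, "HA"), 2),
          round(feedback_effect(cells, "LA"), 2),
          round(interaction_contrast(cells), 2))
```

For girls the feedback effect changes sign between HA and LA Ss, whereas for boys both effects are small and in the same direction, which is consistent with an interaction being found only in the girls' data.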


Delayed Retention 

Analyses similar to those made for the 
immediate retention test scores were per- 
formed on the data from the delayed reten- 
tion test. The adjusted mean gain scores 
for delayed retention are presented in Ta- 
ble 2. 

For girls, the Anxiety × Feedback interaction was significant
(F = 5.86, df = 1/35, p < .025); again, no significant interaction was
obtained for boys. HA girls under the FB condition surpassed the
performance of HA girls under the NO-FB condition [...] and LA girls
surpassed the performance of HA girls under the NO-FB condition
(p < .05).


Discussion 


Evaluation of the Experimental Hypotheses 


For girls, three of the four hypotheses were supported: (a) HA Ss did
better under the FB condition than under the NO-FB condition; (b) LA
Ss did better than HA Ss under the NO-FB condition, and (c) LA Ss did
as well under the FB condition as they did under the NO-FB condition.
The expectation that HA Ss would do significantly better than LA Ss
under the FB condition was not supported. The results suggest that
programming procedures be adapted to individual differences in
test-anxiety level and sex. In this way, the negative effects of
anxiety on learning could be reduced and the positive effects
capitalized upon.

The contrasting findings for girls and 
boys deserve further comment. Evidence of 
sex differences in previous anxiety re- 
search cited earlier indicated that predic- 
tions about the interfering effects of anx- 
iety were more clearly demonstrated for 
girls than for boys. In this study, the fact 
that girls’ data supported experimental ex- 
pectations while boys’ data failed to con- 
firm these predictions is generally consist- 
ent with earlier findings. Obviously, in 
generalizing the results of this study sex 
difference cannot be neglected. 


Feedback during Learning and Facilitated 
Test Performance 


Once the program had been completed 
and the test begun, all Ss were in a “no- 
feedback” condition in that no answers 
were provided for checking their responses 
to test items. Since HA Ss typically are at 
a disadvantage in a test situation, it 
might be expected that their performance 
would be disrupted during the criterion 
test. It is always difficult to differentiate between learning and
performance. In this regard, further speculation may be in order as to
why feedback during learning produced superior test performance for HA
Ss. It is suggested that the FB program allowed Ss to practice
task-relevant responses. That is, written responses were required to
literally hundreds of questions throughout the program, and Ss were
reassured of the adequacy of their task-relevant responses by
feedback. This procedure might be likened to taking a series of
“practice quizzes” over program content as preparation for the
criterion test. For HA Ss most especially, feeling prepared to answer
questions about program content could then have functioned to reduce
disruptive effects of anxiety during the retention tests.


Suggestions for Future Research 


In view of the fact that programmed in- 
struction is finding increasing applications 
at all grade levels and in nearly all sub- 
ject matters encountered in our schools, 
findings from this study suggest some 
worthwhile problems for future investiga- 
tion. First, similar studies are needed in 
which feedback and no-feedback programs 
in different subject matters are adminis- 
tered to HA and LA learners at different 
grade levels. Data from such studies would 
permit greater generalization about the 
interaction between anxiety and feedback 
in programmed instruction, based on grade, 
subject-matter, and sex variables. 

Second, more elaborate research is 
needed to take into account the teacher 
variable. When programmed instruction 
is integrated with conventional classroom 
teaching, the teacher will be another source 
of influence on student performance (Gold- 
beck, Shearer, Campeau, & Willis, 1962). 
In terms of the anxiety model outlined 
earlier, the nature of the teacher’s response 
to student performance will help determine 
the degree to which classroom atmosphere 
is stressful or reassuring. The implica- 
tion for future research is that differences 
in teacher responses to student perform- 
ance should be studied in combination 
with programmed instruction variables 
(e.g., feedback) and with the learner's 
level of anxiety. 

TOKEN REINFORCEMENT OF ACADEMIC PERFORMANCE
WITH INSTITUTIONALIZED DELINQUENT BOYS


VERNON O. TYLER, JR. AND G. DUANE BROWN
Western Washington State College 


Court-committed boys ages 13-15 in a training school observed a daily
television newscast. The following morning in school their teachers
administered a 10-item true-false test based on program content; Ss
were immediately shown their scores. After school, Ss were paid tokens
redeemable for candy, gum, etc. During Phase I (17 days), Group 1
(9 Ss) received tokens contingent on test performance; Group 2 (6
Ss) received tokens on noncontingent (“straight salary”) basis. During
Phase II (12 days), Group 1 received noncontingent reinforcement
and Group 2 contingent reinforcement. Hypothesis that test scores
would be higher under contingent than noncontingent reinforcement
was supported in both between- (p < .05) and within-S (p < .005)
comparisons. Conclusion was that contingent token reinforcement
strengthens academic performance.


Many educators prefer to motivate academic performance with
“intrinsic” rather than “extrinsic” reinforcers; if used at all, they
say, extrinsic reinforcers should be employed with caution (Marx,
1960). At the same time it is recognized that delinquent youngsters
often have academic difficulty in the usual school situation (e.g.,
Bloch & Flynn, 1956; Briggs, Johnson, & Wirt, 1962). Since the IQs of
delinquent youngsters may average well within the normal range (e.g.,
Tyler & Kelly, 1962), low motivation appears to be responsible for
their poor school performance.

Various approaches have been suggested for motivating these
“underachievers.” As Birnbrauer, Wolf, Kidder, and Tague (1965) have
indicated, these include (a) the use of “intrinsically reinforcing”
materials which “are ‘interesting,’ ‘meaningful’,” etc., (b) “using
materials and procedures which combine interest value and high
probabilities of success,” and finally, (c) “presenting social and/or
symbolic reinforcers, e.g., teacher approval, grades, and stars.” But
as Birnbrauer et al. point out, none of these methods may be adequate
for the retarded, school dropouts, and behavior problems. They suggest
token reinforcement systems may be more effective. In such systems the
tokens which are exchangeable for tangible reinforcers become
generalized reinforcers (Skinner, 1953). A few examples of token
reinforcement systems which have strengthened academic performance
include studies with youngsters having reading difficulties (Staats,
Staats, Schutz, & Wolf, 1962), retardates (Birnbrauer, Wolf, Kidder, &
Tague, 1965), nursery school youngsters (Heid, 1964), and elementary
school children (Michael*). However, work with delinquent youths
appears to be quite rare. Cohen (Cohen, Filipczak, & Bis, 1965) has
described a promising program for institutionalized delinquents. Of
course, Slack (1960) and Schwitzgebel (e.g., Schwitzgebel & Kolb, 1964)
have used operant techniques with delinquents, but not directly in the
area of academic performance so far as is known.

*J. Michael, personal communication, June 16, 1965.

For the present study it was assumed that many delinquent youngsters
lack reinforced practice in the skills that result
in teacher ratings of satisfactory perform- 
ance. Apparently, the typical school situa- 
tion does not provide the type of rein- 
forcements necessary to strengthen these 
skills. The purpose of this study was to 
develop procedures for improving the aca- 
demic functioning of a group of delinquent 
boys. This essentially involved setting up 
a “token economy” based on academic 
performance. More specifically, it was 
hypothesized that academic performance 
with contingent reinforcement will be 
superior to performance with noncontin- 
gent reinforcement in both between- and 
within-group comparisons. 


METHOD


The subjects (Ss) in this study were 15 court- 
committed boys, 13-15 years of age who resided 
in a one-cottage living unit of a state training 
school. They attended school in their own self- 
contained classroom supervised by two team 
teachers. At 6 PM every evening, Monday through 
Friday, the television set in the cottage day room 
was turned on to the Huntley-Brinkley news 
broadcast. Youngsters were permitted, but not re- 
quired, to watch the program; the only require- 
ment was that all youngsters in the vicinity of the 
television set remain quiet so that those who 
wished to watch could do so. The following morning in school, Ss were
administered a 10-item true-false test on the news program. The
teachers wrote the questions the night before while watching the
program. They wrote a new question every time there was a change of
subject and two or three items to cover special subjects presented at
the end of the program. The items were simple statements concerning
the current events presented in the broadcast. Of course, this method
meant the items were not standardized for difficulty. Immediately
after administration, the tests were graded and the scores entered on
a grade sheet which each student carried with him. Upon returning to
the cottage in the afternoon, those Ss on contingent reinforcement
were paid in tokens according to the scores they had earned on the
test; Ss on noncontingent reinforcement were paid a “straight salary.”
The tokens were redeemable for canteen items (candy, gum, etc.) and
privileges in the cottage.

TABLE 1
DESIGN FOR ADMINISTRATION OF CURRENT EVENTS TEST REINFORCEMENT

Subjects    Phase I (Days 1-17)            Phase II (Days 18-29)

Group 1     Contingent reinforcement       Noncontingent reinforcement
Group 2     Noncontingent reinforcement    Contingent reinforcement

Note. For Group 1, N = 9; for Group 2, N = 6.


The Ss were paid the tokens according to a schedule designed by the
experimenters (Es). The Es looked at each S’s scores on the true-false
test for the 20 school days prior to the beginning of the experiment.
Considering these data and S's presumed level of motivation, a
judgment was made as to what his schedule should be to maximize test
performance; for example, if an S had been averaging 6 items correct,
and had been earning approximately 20¢ a day in tokens, he would be
given about 15¢ for 6 items correct, 20¢ for 7, 25¢ for 8, 27¢ for 9
and 30¢ for 10 correct. The goal was to let each S earn his previous
average “income” with a

institution, some &: i t 
completion of the study resulting in unequal Ns in 
the two groups. 
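
The example pay schedule described above, for an S who had been averaging 6 items correct and earning about 20¢ a day, can be written as a simple lookup. This is a sketch only: the text does not say what was paid for scores below the lowest listed value or what the noncontingent "straight salary" amount was, so both choices below (0¢ and the prior 20¢ average) are assumptions made for illustration.

```python
# Cents paid per test score, taken from the worked example in the text.
EXAMPLE_SCHEDULE = {6: 15, 7: 20, 8: 25, 9: 27, 10: 30}

# Assumption: the flat daily payment equals the S's prior average income.
ASSUMED_STRAIGHT_SALARY = 20


def contingent_pay(score, schedule):
    """Cents earned for a test score under contingent reinforcement.

    Scores not listed in the schedule pay 0 cents (an assumption; the
    source gives no amounts below the S's prior average score).
    """
    return schedule.get(score, 0)


def noncontingent_pay(score, salary=ASSUMED_STRAIGHT_SALARY):
    """Cents earned under noncontingent reinforcement: the same amount
    regardless of the test score."""
    return salary


for score in (5, 6, 8, 10):
    print(score, contingent_pay(score, EXAMPLE_SCHEDULE), noncontingent_pay(score))
```

Under this schedule an S matches his earlier income only by scoring one item above his earlier average, and earns more only by improving further.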

In Phase I, Ss in Group 1 were placed on con- 


contingent 
less of how well they did on the test). In Phase II, 


counter! | nece 0 
sate for uncontrolled variability in the difficulty of 
the tests from day to day. 

Although Group 1 (mean age 15.6) averaged a 
year older than Group 2 (mean age 14.6), both 
functioning in the low average 


absent 


the tests were not equated for i ‘ 
to day, the problem of missing data was serious 


cluded for analysis. Missing scores for each S were 
replaced with the S’s mean score for the phase. 
From Phase I, data are reported from 17 of the 27 
days on which tests were administered; from Phase 
II, data are reported from 12 out of 29 days. As is 


apparent, it was ne s 
tities of data in order to make comparisons in 
which most of the Ss of both groups were represented. 
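
The replacement of missing daily scores with the S's own mean for the phase can be sketched as follows. The scores are hypothetical and the function name is invented for the example.

```python
from statistics import mean


def fill_missing(phase_scores):
    """Replace None entries with the mean of the S's observed scores
    in the same phase."""
    observed = [s for s in phase_scores if s is not None]
    phase_mean = mean(observed)
    return [phase_mean if s is None else s for s in phase_scores]


# Hypothetical daily scores for one S in one phase; None marks an absence.
raw = [6, None, 8, 7, None]
filled = fill_missing(raw)
print(filled)
```

One property of this procedure is that it leaves each S's phase mean unchanged, so analyses based on phase means are unaffected by the filled-in values, although day-by-day group curves are smoothed toward those means.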


RESULTS


Mean daily test scores for both groups for Phases I and II over the
29 days reported are presented in Figure 1. Means for each phase are
also included. The data show a clear pattern: during Phase I, Group 1
surpassed Group 2 on 15 out of 17 days; during Phase II, Group 2
surpassed Group 1 on 9 out of 12 days. Reversals when they did occur
were quite small in contrast to the predicted differences between
groups. The irregular, spiked form of the curves suggests that the
tests varied a good deal in difficulty level from day to day as was
expected. The nearly parallel form of the two curves indicates the
groups responded to these variations in difficulty in a highly
consistent, reliable fashion.

[Figure 1: test score (number correct) plotted by day, Days 1-29,
Phase I and Phase II.]
Fig. 1. The effect of contingent and noncontingent token reinforcement
on true-false test performance.


TABLE 2
ANALYSIS OF VARIANCE OF SUBJECT MEANᵃ TEST SCORES UNDER CONTINGENT
AND NONCONTINGENT REINFORCEMENT

Source               df      MS       F

Between Ss
  B (Groups)          1      .02
  Error (b)          13      .43
Within Ss
  A (Phases)          1     2.60     7.43*
  AB                  1     2.49     7.11**
  Error (w)          13      .35

ᵃ Analysis based on two values from each subject: the mean of his 17
daily scores from Phase I and the mean of his 12 scores from Phase II.
*p < .025; two-tailed.
**p < .0125; one-tailed.
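
As an informal gauge of the day counts reported above (15 of 17 days in Phase I, 9 of 12 in Phase II), one can ask how often a fair coin would give at least that many "wins." The sketch below is only an illustration: the authors did not report this sign test, and it treats days as independent, which is an assumption.

```python
from math import comb


def sign_test_upper_tail(successes, n):
    """One-tailed probability of at least `successes` wins in n fair
    coin flips (binomial with p = .5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


phase1 = sign_test_upper_tail(15, 17)  # Group 1 ahead on 15 of 17 days
phase2 = sign_test_upper_tail(9, 12)   # Group 2 ahead on 9 of 12 days
print(round(phase1, 4), round(phase2, 4))
```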


The S means for each phase were treated with a Lindquist (1953) Type I
design analysis of variance. The summary of this analysis in Table 2
indicates there was no difference between Groups 1 and 2 (B
comparison), but that the difference between Phases I and II (A
comparison) and the interaction between Groups × Phases (AB
comparison) were both significant (p < .025 and p < .0125,
respectively). No difference was expected in the B comparison because
of the counterbalancing of treatments. The difference between phases
may be attributed to day-to-day variability in the difficulty level of
the tests. The interaction effect indicates that under contingent
reinforcement, performance was at a significantly higher level than
under noncontingent reinforcement. Since the direction of the
interaction effect was predicted, the probability value was halved
(one-tailed test).
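
The F ratios in the analysis of variance are each an effect mean square divided by the within-Ss error mean square. The arithmetic can be checked as below, taking the within-Ss error mean square to be .35, the reading that reproduces the reported F of 7.43 for Phases; that value, like the variable names, should be treated as an assumption of this sketch.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic: effect mean square over error mean square."""
    return ms_effect / ms_error


MS_PHASES = 2.60        # A (Phases), from Table 2
MS_INTERACTION = 2.49   # AB (Groups x Phases), from Table 2
MS_ERROR_WITHIN = 0.35  # assumed reading of Error (w)

f_phases = f_ratio(MS_PHASES, MS_ERROR_WITHIN)
f_interaction = f_ratio(MS_INTERACTION, MS_ERROR_WITHIN)
print(round(f_phases, 2), round(f_interaction, 2))
```

Both ratios are evaluated on 1 and 13 degrees of freedom; the interaction's probability is then halved because its direction was predicted in advance.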

While the within-S variances against which the interaction effect was
tested were not significantly heterogeneous, some question could be
raised about the normality of distributions of within-S difference
scores. To avoid the assumptions of homogeneity of variance and
normality of distributions and to study individual S performance, the
data were subjected to nonparametric treatment. Mean scores for each S
for each phase (same data as for analysis of variance) were classified
as to whether the trend in the data supports (+) or does not support
(−) the prediction that each S will perform at a higher level when
token reinforcement is contingent on his test score than when it is
not. Twelve of the 15 Ss did better under contingent reinforcement;
only two Ss did worse. The Wilcoxon matched-pairs signed-ranks test
(Siegel, 1956) was applied to these data yielding a highly significant
T of 11 (p < .005; one-tailed test).
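
The Wilcoxon T statistic is the smaller of the two sums of signed ranks, computed after dropping zero differences. A minimal sketch follows. The difference scores are hypothetical, chosen only to mirror the reported pattern (12 Ss better, 2 worse, 1 unchanged); they are not the study's data, and the agreement with the reported T is by construction.

```python
def wilcoxon_t(differences):
    """Return (T, n): the smaller signed-rank sum and the number of
    nonzero differences. Tied absolute values share their mean rank."""
    nonzero = sorted((d for d in differences if d != 0), key=abs)
    pos = neg = 0.0
    i = 0
    while i < len(nonzero):
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for d in nonzero[i:j]:
            if d > 0:
                pos += avg_rank
            else:
                neg += avg_rank
        i = j
    return min(pos, neg), len(nonzero)


# Hypothetical contingent-minus-noncontingent differences, in tenths of
# a test item, for 15 Ss.
diffs = [1, 2, 3, -4, 5, 6, -7, 8, 9, 10, 11, 12, 13, 14, 0]
t_stat, n_pairs = wilcoxon_t(diffs)
print(t_stat, n_pairs)
```

The resulting T would then be compared with a critical value for the number of nonzero pairs, as in the tables given by Siegel (1956).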

The between-groups effects were also tested for each phase
separately, using the Mann-Whitney U test for independent measures
(Siegel, 1956). As predicted, using one-tailed tests, during Phase I,
Group 1 surpassed Group 2 (U = 18, approaches significance at the .05
level); during Phase II, Group 2 surpassed Group 1 (U = 12, p < .05).
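
The Mann-Whitney U statistic can be computed by counting, over all cross-group pairs, how often a score from one group exceeds a score from the other. The sketch below uses hypothetical phase means for groups of 9 and 6 Ss, matching only the group sizes in the study; the values are not the study's data and are not meant to reproduce the reported U.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for independent groups.
    Ties between groups count one half toward each U."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical S means for one phase (9 Ss and 6 Ss).
group1 = [7.5, 7.1, 6.9, 6.8, 6.6, 6.4, 6.2, 5.9, 5.5]
group2 = [7.0, 6.5, 6.1, 5.8, 5.6, 5.2]
u_stat = mann_whitney_u(group1, group2)
print(u_stat)
```

A smaller U indicates less overlap between the groups; the statistic is referred to a table for the two group sizes to obtain its probability.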


Discussion 


Both between-groups and within-groups 
data clearly indicate that contingent rein- 
forcement was associated with higher test 
performance than when reinforcement was 
noncontingent. This pattern emerged in 
spite of the use of quickly prepared un- 
standardized test items which varied con- 
siderably in difficulty from day to day and 
in spite of unstable conditions such as 
the shifting institutional population. 

Moreover, this pattern appeared and was 
maintained with consistency over a 12- 
week interval. This would suggest more 
than a transitory effect, more than de- 
linquents “playing games” with the pro- 
gram or a novelty that wore off. 

While the effect of the contingent reinforcement is statistically
significant, the practical educational significance appears limited at
this point. The Ss on contingent reinforcement averaged less than one
test item better performance than when they were on noncontingent
reinforcement. On the other hand, it should be noted that Ss were
attending small classes (10 students per teacher) led by teachers who
in Es’ judgment were about the most competent they had ever seen.
These teachers had a knack with obstreperous youngsters; they knew how
to discipline them
and yet they were quite skilled and in- 
genious at devising methods of exciting the 
interests of even the most apathetic young- 
ster. Thus the token reinforcement was 
tried against the severe competition of 
undoubtedly powerful social reinforce- 
ments supplied by these teachers. 

That the reinforcement showed an effect 
in addition to what was generally re- 
garded as an effective instructional pro- 
gram is further evidence of the importance 
of tangible reinforcers with delinquent and 
disadvantaged youngsters. It is doubtful 
that the tokens would have been this 
effective in a prosperous urban junior high 
school in which the youngsters were sati- 
ated with tangibles, enjoyed school, and 
were achieving “success” in the middle- 
class culture. 

Replication of this study with more pre- 
cise controls would more clearly demon- 
strate the effectiveness of this procedure. 
These controls should include an unrein- 
forced control group and test items con- 
structed to be more nearly equal in diffi- 
culty. Previous efforts by the investigators 
to produce improved academic perform- 
ance with token reinforcement showed no 
results, presumably because of inade- 
quate controls, particularly with regard to 
the measurement of the criterion. 

Ultimately, of course, efforts must be 
made to “wean” youngsters from token 
reinforcers and link academic performance 
to the more traditional reinforcers such as 
social approval and perhaps even the 
“intrinsic” reinforcement of work “for the 
joy of the working [Kipling, 1896].” However, the results of the
present study are encouraging and suggest that many youngsters who are
uninterested and antagonistic toward school work can learn that school
work can “pay off.”

RELEVANT AND IRRELEVANT VERBALIZATION IN
DISCRIMINATION AND REVERSAL LEARNING
BY NORMAL AND RETARDED CHILDREN¹


NORMAN A. MILGRAM AND JAMES S. NOCE
Catholic University of America 


Children age 7 and mental retardates of comparable MA were
administered a discrimination learning and reversal shift task under
varying conditions of relevant and irrelevant verbalization. When
compared with a condition of no-verbalization, verbalization of the
relevant class of cues facilitated performance, verbalization of 1
relevant and 1 or more irrelevant dimensions was also facilitative,
but to a lesser degree, and verbalization of irrelevant dimensions
only interfered. These effects were more evident in discrimination
learning than in reversal shift and more evident in normals than in
retardates. Results were interpreted as supporting the notion that
verbal labels may direct attention toward or away from criterial
dimensions.


The research literature on the effect of verbalization on
discrimination learning and reversal shift in children has not yielded
definitive results. Spiker (1963) reported that learning verbal labels
for stimuli led to more rapid acquisition of relevant discriminations.
Spiker did not utilize a reversal shift paradigm, but on the basis of
his explanatory concept of enhanced cue distinctiveness one would
predict that verbal labels referring to criterial cues should
facilitate reversal shift as well as the original discrimination
learning.

O'Connor and Hermelin (1959) confirmed a diametrically opposite
prediction for a group of trainable mental retardates, namely that
verbalization of relevant cues actually interfered with reversal
shift. They argued that verbal connections in subjects (Ss) below MA 6
whether normal or retarded would operate in a rigid stereotyped manner
so that verbalization of a criterial cue (e.g., “the big one”) which
was correct during discrimination learning would interfere with
acquiring the new or reversed response (the smaller stimulus) during
reversal shift.


¹ A portion of this paper was submitted by the junior author as a
master’s thesis to the Department of Psychology and Psychiatry,
Catholic University of America, 1966. The senior author served as
major professor on the thesis committee and was Principal Investigator
of National Institute of Mental Health Grant MH 08488 which supported
the overall study.


They acknowledged, however, that Ss of 
higher MA, whether normal or retarded, 
could utilize verbal connections with 
greater flexibility and might well profit 
from overt verbalization of relevant cues 
in solving reversal shift problems. 

Kendler, Kendler, and Wells (1960) re- 
ported that verbalization by 4-year-old 
children of relevant or irrelevant cues dur- 
ing a block of trials between original dis- 
crimination learning and reversal shift had 
no effect on the latter. In a second study 
(Kendler & Kendler, 1961), children age 4 
and 7 were trained to give relevant or ir- 
relevant responses during original dis- 
crimination learning. Results indicated that 
irrelevant verbalization significantly hin- 
dered reversal shift at both age groups, 
while relevant verbalization facilitated re- 
versal shift for the 4-year-old children 
only. These investigators attributed the failure of the 7-year-olds
to benefit from overt verbalization of relevant cues to spontaneous
covert verbalization of relevant cues; consequently, overt
verbalization of relevant cues would not make any difference.

It appears reasonable to assume, how- 
ever, that if the interfering effect of irrele- 
vant verbalizations can be demonstrated, 
the opposite effect should also obtain. Fail- 
ure to demonstrate verbal facilitation may 
have been due to a methodological defect 
in the Kendler paradigm. The reversal shift required was relatively
easy for the 7-year-olds and was attained almost immediately. Children
merely had to shift from one cue of a brightness or size dimension to
the other cue. If the stimuli were varied on more than two dimensions,
discrimination learning and reversal shift would be more difficult and
the facilitation of overt relevant verbalization would then be
demonstrated. To increase task difficulty four pairs of stimuli were
utilized in the present study, each pair varying simultaneously along
two cue values of each of three dimensions (size, brightness, shape);
for example, a small black square versus a big white circle.

The Kendler paradigm, moreover, ap- 
pears to have confounded reversal and non- 
reversal shift. Children were instructed to 
verbalize a cue within a dimension that 
was to become relevant or irrelevant only 
in the subsequent shift condition. Children 
in the relevant condition were trained to 
verbalize brightness and retained the 
brightness dimension with reversed cues on 
the second task. The other children were 
trained to verbalize size, which was also a 
correct guide to mastering the original dis- 
crimination task, but which became irrele- 
vant during reversal shift when only 
brightness was consistently reinforced. 
From the point of view of these Ss, the 
second task consisted of a nonreversal 
shift, that is, shifting from size to the 
brightness dimension. Viewed in this man- 
ner, the Kendler results may be interpreted 
merely as showing that children age 7 find 
reversal shift easier than nonreversal shift 
rather than as evidence for the interfering 
effects of irrelevant verbalization. 

One way to meet this methodological ob-
jection is to require verbalization whose
irrelevance to the criterial dimension is evi- 
dent to the child each time he utters the 
verbalization. Let us assume that the chil- 
dren are learning to discriminate size and 
that one group has been instructed to des- 
ignate response choices by verbal labels re- 
ferring to the dimension of shape. This di- 
mension is irrelevant throughout since a 
circle stimulus and a square stimulus are 
presented on each trial and shape is rein-
forced on a chance basis. One would pre-
dict that children employing verbal labels
of the shape dimension are less likely to 
attend to the criterial size dimension than 
a base-line condition of no verbalization 
and would require more trials to reach cri- 
terion either in the original discrimination 
phase or the subsequent reversal shift 
phase. By contrast, children instructed to 
employ size designations are more likely to 
attain and to reverse the size discrimina-
tion than Ss in the base-line condition.

It is an open question whether employ- 
ing relevant and irrelevant verbal desig- 
nations of response choices cancels itself 
out and is not significantly different from 
the no-verbalization condition or whether 
it exercises a significant effect in either di-
rection. The present investigators favored
the notion that mixed verbalization, rele- 
vant and irrelevant designations, would 
facilitate performance by directing atten- 
tion to the criterial dimension being rein- 
forced consistently. The directing of 
attention to irrelevant dimensions would 
be of less magnitude since these dimensions 
are reinforced only on a chance basis. The
overall effect would thus be facilitative. 

The present study attempted to answer
these questions in an experimental design 
comparing two groups of children of differ- 
ent IQ level, but of comparable MA. In re- 
viewing the experimental literature on the 
relative ease of discrimination learning by 
normals versus educable retardates of com- 
parable MA, Stevenson (1963) states that 
no definitive conclusion has been reached. 
The relative ease of reversal shift also re-
mains an unanswered question. In accord
with the Lewin-Kounin theory of rigidity
or the view of Luria (1957) and Reese
(1962) on verbal mediation deficiency, one
would predict an inferior performance in
retardates, but Plenderleith (1956) and
Stevenson and Zigler (1957) report equiva-
lent performance by retardates and nor-
mals of comparable MA.


METHOD
Subjects 


TABLE 1
MEAN CA AND MA FOR RETARDATES

                    CA               MA
Condition        M      SD        M      SD
1DR            15.7    4.5       6.7    1.5
2DR            15.6    4.0       7.2    1.2
3DR            14.4    2.3       6.8    0.8
C              16.7    4.0       7.0    1.0
1D-IR          13.7    2.8       6.8    0.8


Normal Ss were 96 children with a mean CA of
7.5 years and a range from 7 years up to but not
including the eighth birthday. These children were
selected from day camps located in the suburban 
Washington, D. C., area and were randomly 
assigned, 16 each, to one of six experimental con-
ditions. The retardates were 70 residents of the
Children’s Center of the District Training School,
Laurel, Maryland.* The scarcity of suitable
retardates restricted their participation to five
major conditions, and they were randomly assigned
to one of these. Mean CA and MA scores based
on a recent Peabody Picture Vocabulary Test of
the 14 Ss assigned to each condition are presented
in Table 1. The mean CA ranged from 13.7 to 16.7
and the mean MA score ranged from 6.7 to 7.2. 
Overall, there were no significant differences in 
CA or MA between retarded Ss assigned to the 
various experimental conditions. 


Stimulus Materials 


Stimuli were two-dimensional squares and cir- 
cles mounted in pairs on cardboard rectangles. The 
stimuli varied in size (1 inch and 3 inches) and 
brightness (black and white). The eight resulting 
stimuli were presented in four combinations of 
pairs such that each member of the pair differed 
from the other in all three dimensions (size, bright-
ness, and shape).
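
The stimulus set described above can be sketched in a few lines. This is only an illustration: the value names ("little", "big", and so on) follow the verbal labels used in pretraining, and the function names are hypothetical, not taken from the study.

```python
from itertools import product

SIZES = ("little", "big")        # 1 inch and 3 inches
BRIGHTNESS = ("black", "white")
SHAPES = ("square", "circle")

# The eight stimuli: every combination of the three two-valued dimensions.
STIMULI = list(product(SIZES, BRIGHTNESS, SHAPES))


def complement(stimulus):
    """Return the stimulus that differs from `stimulus` on all three dimensions."""
    size, brightness, shape = stimulus
    return (
        SIZES[1 - SIZES.index(size)],
        BRIGHTNESS[1 - BRIGHTNESS.index(brightness)],
        SHAPES[1 - SHAPES.index(shape)],
    )


def make_pairs():
    """Group the eight stimuli into four pairs of full complements."""
    pairs, seen = [], set()
    for stimulus in STIMULI:
        if stimulus in seen:
            continue
        partner = complement(stimulus)
        seen.update({stimulus, partner})
        pairs.append((stimulus, partner))
    return pairs


PAIRS = make_pairs()
```

Because each stimulus has exactly one full complement, the eight stimuli fall into exactly four such pairs, for example a little black square against a big white circle.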


Procedure 


Pretraining in verbal labels. All children were 
tested individually by the second investigator. He 
explained that S was going to play a game but 
must first become acquainted with the game 
materials. The S was shown the stimuli in pairs 
and was instructed to designate all stimuli by 
one, two, or three dimension labels. Depending 
upon conditions, S employed designations of size 
(“the big one,” “the little one”); brightness (“the
black one,” “the white one”); shape (“the circle,”
“the square”) or a combination of these. In the
experimental conditions which combined relevant
and irrelevant dimensions or utilized more than
one irrelevant dimension, Ss were trained to use

5 The authors wish to express their deepest
appreciation to the staff and residents of the
Children’s Center, District Training School,
Laurel, Maryland, for their cooperation.

such combined designations as “the big black
one,” “the white circle,” “the little square,” “the
big black square,” etc. Each S was required to
reach a criterion of once through all eight stimuli 
giving each its proper designation. In the discrimi- 
nation and reversal shift phases each S always in- 
dicated response choice between each pair of 
stimuli by employing only the designation peculiar 
to his pretraining experience. The control group 
who were not to employ verbal designations at 
all but to indicate discrimination choice by point- 
ing were exposed to the eight stimuli the average 
length of time required by the various verbal 
groups to complete the pretraining phase (2 min- 
utes). These Ss were commanded to look care- 
fully at each card, but were neither given verbal 
labels nor permitted to employ their own verbal 
designations. 

Discrimination learning. Each S was informed 
that the object of the game was to acquire a 
plastic poker chip each time he chose the correct 
of two presented stimuli. He was further told that 
after the game he would be able to trade in his 
pile of poker chips for a candy bar. The four pairs 
of stimuli were presented in a systematic alter- 
nation such that no one pair of stimuli appeared 
more than twice in succession and across all trials 
the correct cue appeared as often on the right 
as on the left. Following each trial the stimulus
placards were raised. If a poker chip was found
under the placard that was chosen, it was then 
given to S both as reward and feedback that he 
had made the correct choice. Between trials a 
vertical screen was imposed while the wells were 
baited and then covered by the next pair of 
stimuli. Criterion of successful learning was 9 out 
of 10 correct choices to a maximum of 72 trials. 
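
The learning criterion can be made concrete with a small scoring function. This is a sketch under one stated assumption: criterion is taken to be met on the first trial at which at least 9 of the 10 most recent choices were correct. The text does not say whether the window was sliding or blocked, and the function name is hypothetical.

```python
def trials_to_criterion(outcomes, window=10, required=9, max_trials=72):
    """Return (trials_used, reached) for a list of correct/incorrect choices.

    Assumption: a sliding window over the most recent `window` trials.
    """
    recent = []
    for trial, correct in enumerate(outcomes[:max_trials], start=1):
        recent.append(bool(correct))
        if len(recent) > window:
            recent.pop(0)  # keep only the last `window` outcomes
        if len(recent) == window and sum(recent) >= required:
            return trial, True
    return min(len(outcomes), max_trials), False
```

A child who errs on the first two trials and is correct thereafter would reach criterion on trial 11 under this reading; a child who never reaches it is scored at the 72-trial maximum.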

Reversal shift. These trials followed the learn- 
ing trials without any further instruction or osten- 
sible variation in procedure. The difference was 
that chip reinforcements were now given for the
reverse of the previously reinforced dimension.
The Ss who failed to reach criterion on the
original discrimination were not exposed to the
reversal condition for the reason that one cannot
be expected to reverse a discrimination that one
has not acquired. Reversal shift continued until 
criterion of 9 out of 10 correct trials or a maxi- 
mum of 72 trials was reached. 


Experimental Schema 


The relevant dimensions were size and bright-
ness, while shape was never relevant. One dimen-
sion only was relevant for a given S in a given
condition, yet all three dimensions were available
simultaneously in the presentation of any pair of
stimuli. The experimental conditions were as fol-
lows: (a) Verbalization of the one relevant dimen-
sion, size or brightness (1DR). (b) Verbalization
of one relevant and one irrelevant dimension, the
coupling of size or brightness with the other or
with shape (2DR). (c) Verbalization of one rele-
vant and two irrelevant dimensions, size or bright-
ness with the other and shape (3DR). (d) Verbali-
zation of one irrelevant dimension, either size,
brightness or shape (1D-IR). (e) Verbalization
of two irrelevant dimensions, either size or bright-
ness and shape (2D-IR). (f) A control or non-
verbalization condition (C).
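
The six conditions above can be summarized by how many relevant and irrelevant dimensions each child verbalized. The sketch below is illustrative only: the dictionary layout and both function names are assumptions, and the labels it builds follow the examples quoted in the pretraining description.

```python
# code: (relevant dimensions verbalized, irrelevant dimensions verbalized)
CONDITIONS = {
    "1DR": (1, 0),
    "2DR": (1, 1),
    "3DR": (1, 2),
    "1D-IR": (0, 1),
    "2D-IR": (0, 2),
    "C": (0, 0),
}


def designation(stimulus, dimensions):
    """Build the verbal label for `stimulus` from the trained dimensions.

    `stimulus` is a dict with keys "size", "brightness", and "shape".
    """
    adjectives = [stimulus[d] for d in ("size", "brightness") if d in dimensions]
    noun = stimulus["shape"] if "shape" in dimensions else "one"
    return " ".join(["the", *adjectives, noun])


def matches_condition(condition, relevant, verbalized):
    """Check that the verbalized dimensions fit the condition's definition."""
    n_relevant, n_irrelevant = CONDITIONS[condition]
    relevant_count = 1 if relevant in verbalized else 0
    return (relevant_count == n_relevant
            and len(verbalized) - relevant_count == n_irrelevant)
```

For a child whose relevant dimension is size, verbalizing size and shape ("the big square") fits 2DR, while verbalizing shape alone ("the square") fits 1D-IR.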


RESULTS


Two types of data were analyzed, fre- 
quencies of Ss successfully reaching cri- 
terion and mean number of trials. Since 
the variances of trial scores varied widely 
between conditions these scores were rou- 
tinely subjected to a square root transfor- 
mation prior to analysis. All statistical 
tests reported were two-tailed. Frequency 
and transformed trial scores in the dis- 
crimination learning phase are presented in 
Table 2. It is noted that the distribution of 
frequencies by condition followed predic- 
tion and was nearly identical for normal 
and retarded groups. When group fre-
quency data were collapsed, 1DR was sig-
nificantly superior to C (χ² = 7.07, p <
.01) and 1D-IR was significantly inferior
to C (χ² = 4.43, p < .05). The conditions
combining relevant and irrelevant dimen-
sions, 2DR and 3DR, were equivalent to C
and were also significantly superior to 
1D-IR (and 2D-IR for normals also). 
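
The square root transformation mentioned above is a standard way to pull in the long upper tail of trial counts so that variances are more alike across conditions. The following sketch uses invented trial counts purely for illustration; they are not data from the study, and the helper names are hypothetical.

```python
import math
from statistics import variance


def sqrt_transform(trial_counts):
    """Square-root transform raw trials-to-criterion scores."""
    return [math.sqrt(t) for t in trial_counts]


def variance_ratio(low_group, high_group):
    """Ratio of sample variances, larger-variance group on top."""
    return variance(high_group) / variance(low_group)


# Hypothetical raw scores for a fast and a slow condition (not study data).
fast = [10, 12, 14, 16, 18]
slow = [30, 45, 60, 72, 72]

ratio_raw = variance_ratio(fast, slow)
ratio_transformed = variance_ratio(sqrt_transform(fast), sqrt_transform(slow))
```

With these invented scores the variance ratio between conditions shrinks considerably after transformation. For scale, a single transformed score of 3.79 corresponds to roughly 14 raw trials, since 3.79 squared is about 14.4.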

Analysis of variance of transformed 
trials for all Ss in five experimental con- 
ditions (data on 2D-IR for normals were 
not included) yielded a highly significant 
F ratio (F = 10.99, df = 4/130, p < .01)
for the condition effect. The array of means 
constituted a near perfect hierarchy in 


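TABLE 2
MEAN TRANSFORMED TOTAL TRIALS OF
DISCRIMINATION LEARNING IN
NORMALS AND RETARDATES

                       Normals                        Retardates
              Subjects                        Subjects
              reaching                        reaching
Condition     criterion     M       SD        criterion     M        SD
1DR              16        3.79    0.6[?]     [illeg.]   [illeg.]  [illeg.]
2DR              13        5.07    2.00       [illeg.]   [illeg.]  [illeg.]
3DR              12        5.64    2.14          10        5.76    [illeg.]
C             [illeg.]     6.39    1.91          11        5.4[?]  [illeg.]
1D-IR             8        6.40    2.93           6        6.[?]   [illeg.]
2D-IR             6        7.95    0.97           -          -        -

Note. [illeg.] marks a cell that is unreadable in the source; [?] marks a value that is only partly legible.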
the predicted direction and t tests of dif-
ferences between conditions were signifi-
cant for 1DR > 2DR (p < .05), 2DR >
C (p < .05), and in turn, C > 1D-IR 
(p < .05). The effect of IQ level and the 
interaction were not significant. 

In the above analysis differences in mean 
trials between conditions were confounded 
with differential frequencies of Ss success-
fully reaching criterion, since nearly all Ss
reached criterion in some conditions and 
only half in others. We may ask whether 
the predicted differences between condi- 
tions also obtain when we consider the 
trials of only those Ss who successfully 
reached criterion. Mean transformed scores 
were computed in this manner and the re- 
sulting array was nearly identical in order 
to the earlier array for total trials. For nor- 
mal children, 1DR = 3.79, 2DR = 4.28, 
3DR = 4.69, C = 5.44, 1D-IR = 4.31,
2D-IR = 7.06. For retardates the cor- 
responding means were 3.73, 4.33, 4.67, 
4.61, and 4.90. An overall analysis of vari- 
ance was not feasible in view of the wide 
range of Ss in each condition, 6-16. Indi-
vidual t tests for normals yielded substan-
tially the same significant comparisons as
in the earlier analysis. The conditions of
1DR and 2DR were significantly superior
to C (p < .001 and < .05, respectively) 
and in turn, C > 2D-IR (p < .05). The 
only significant exception to the predicted 
array of means was the mean trials of 
1D-IR which was not significantly differ- 
ent from that of the relevant verbalization 
conditions. When we consider, however, 
that this mean score of 4.31 is based on
only half of the original Ss per cell and
may represent relatively fast learners
within the cell or reflect some other uncon-
trolled artifact, we are not surprised that a
given score may go against prediction. For
retardates, the range of mean scores was
circumscribed and the only significant com-
parison was between 1DR and C (p <
.05). In general, it may be concluded that
fewer Ss in the C condition solved the dis-
crimination task than Ss given relevant
verbalization, and that those who did reach
criterion in the C condition required more
trials to do it.


TABLE 3
MEAN TRANSFORMED TOTAL TRIALS OF REVERSAL
LEARNING IN NORMALS AND RETARDATES

                       Normals                        Retardates
              Subjects                        Subjects
              reaching                        reaching
Condition     criterion     M        SD       criterion     M        SD
1DR              16        3.59     0.51         13        4.23     1.59
2DR           [illeg.]   [illeg.]  [illeg.]   [illeg.]   [illeg.]  [illeg.]
3DR              12      [illeg.]  [illeg.]   [illeg.]   [illeg.]  [illeg.]
C             [illeg.]     5.01     1.25         11        4.33     1.05
1D-IR             5        5.90     2.43          5        5.46     2.45
2D-IR             3        6.93     2.00          -          -        -

Note. [illeg.] marks a cell that is unreadable in the source.


Turning to reversal shift, frequencies 
and mean transformed total trials scores 
for both groups are presented in Table 3. 
It is observed that the majority of Ss, 105 
out of 120, were successful in solving the 
reversal shift. Of the 15 failures, only 5 
were retarded, so that retardates did not 
contribute a disproportionate share. Of 
these 15 Ss, 7 were in the entirely irrele- 
vant verbalization conditions and the fail- 
ure ratio was 7 out of 20 as compared to 8 
out of 100 for the remaining conditions. A 
chi-square corrected for continuity yielded 
a value of 9.90 (df = 1, p < .01).
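
For readers who want to see how a continuity-corrected chi-square is computed for a 2 x 2 table, a minimal sketch follows. The cell layout is an assumption built from the failure ratios quoted above (7 failures of 20 against 8 of 100); the authors' exact table is not given, so the result should not be expected to reproduce the reported value.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed layout: rows are condition groups, columns are failed / reached criterion.
statistic = yates_chi_square(7, 13, 8, 92)
```

Under this assumed layout the statistic comes out near 8.8 rather than 9.90, which suggests the authors' cell counts or correction differed in some detail. Either value exceeds 6.63, the .01 critical value for df = 1.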

The array of mean reversal trials was in 
the predicted direction, but the range of 
scores was reduced especially for retar-
dates where there were no significant dif-
ferences between conditions. For the nor-
mals 1DR > C (p < .01) while all other
comparisons were in the predicted direc- 
tion, but did not attain formal significance. 
There were no overall significant differ- 
ences between normals and retardates in 
reversal shift. A comparison of discrimina- 
tion learning versus reversal shift trials 
also yielded no significant differences. As 
might be expected from the similar array 
of mean scores by condition in discrimi-
nation learning and reversal shift, there
was a highly significant product-moment 
correlation coefficient, .40 and .37 for nor- 
mals and retardates, respectively (signifi- 
cant beyond .01). 


Discussion 
Modifications in the two-choice discrim-
ination reversal paradigm were only par-
tially successful in demonstrating the fa-
cilitative effect of relevant verbalization
and interference due to irrelevant verbali- 
zation. With reference to discrimination 
learning, evidence is ample that verbaliz- 
ing the relevant cue significantly improved 
performance over a base-line or no-ver- 
balization condition and that verbalization 
of irrelevant cues interfered. Not only did 
more Ss successfully reach criterion, but 
those who did, reached it more rapidly and 
the reverse was true when irrelevant cues 
were verbalized. This finding is consistent 
with the notion that verbal designations in- 
crease the probability that Ss will attend 
to the stimulus parameters to which these 
designations refer. Those parameters that 
are consistently reinforced will be increas- 
ingly attended to, while the attending 
responses that are reinforced on a 50% or 
chance basis are gradually extinguished. 
This explanation agrees with such theoreti- 
cal positions as House and Zeaman (1959) 
and Wyckoff (1952) who have formulated 
concepts of “observing” or “attending” 
responses and allow for a variety of vari-
ables, verbal labels, novelty, etc., to in-
crease the probability of attending to a
given class of cues. 

When we consider the adverse effect of 
the consistent verbalization of irrelevant 
dimensions in 1D-IR and 2D-IR, we 
might well conjecture that the mixed con- 
dition, verbalizing two or more dimensions, 
one or more of which is irrelevant every 
time Ss make a conceptual choice, should 
drastically reduce rate of learning even be- 
low that found in the control condition. 
This was not the case, and 2DR and 3DR 
were as effective, if not more effective, 
than the control condition and were cer- 
tainly more effective than conditions in 
which verbalization was entirely irrele- 
vant. If the net effect was not always bene- 
ficial when compared with this base-line 
condition, it was certainly not detrimental. 
We may conclude that Ss recited irrelevant 
terms without necessarily attending to all 
the referent classes of cues and thereby 
being distracted from the criterial class. 
The Ss did not attend equally to every- 
thing they said, but attended more to those 
verbal cues which were consistently rein-
forced and presumably ignored those di-
mensions which they uttered and which
were inconsistently reinforced. 

The data from reversal learning do not
provide equally impressive evidence in 
support of the effects of verbalization. This 
is due partly to a marked reduction in the 
number of Ss in reversal shift because of 
the many Ss who failed to master the 
original discrimination task within the al- 
lotted trials and were not exposed to the re- 
versal shift phase. Differences between 
means of the various conditions were 
generally in the predicted direction and 
with a larger number of Ss per condition 
these differences might have attained for- 
mal significance at least in the case of the 
normal children.

The finding that overt relevant verbali- 
zation facilitates performance does not 
exclude the possibility raised by Kendler 
that covert verbalization may spontane- 
ously occur in children age 7 or 8. We may 
assert, however, that providing Ss with 
relevant verbal designations apparently di- 
rects their attention to criterial cues more 
rapidly than would occur when Ss are left 
to their own resources. 

The mechanism responsible for the dele- 
terious effect of irrelevant verbalization in 
the reversal shift phase is not entirely 
clear. Generally, the more trials Ss re- 
quired to attain the correct discrimination, 
the more trials that were necessary to at-
tain the reversal. Since this greater diffi-
culty between the two phases was experi-
enced by Ss largely in the interfering
conditions, we could argue that the fact of
verbalization of irrelevant cues which con-
tinued throughout both phases served to
strengthen erroneous attending responses.
Another viewpoint would ignore the effect
of the continuing irrelevant verbalization 
and would stress the failure of these Ss to 
master the original discrimination as 
strongly as Ss in the facilitating conditions. 
At the point of transition to reversal shift 
the former Ss may have met the same ex-
perimentally determined criterion level
without attaining thereby an implicit
mediating response of comparable dis-
tinctiveness. It is planned to examine this
question in a future study comparing re-
versal shift with and without continuing
verbalizations of irrelevant cues.

There were no overall statistical differ-
ences between normals and retardates, but
the predicted differences found greater 
confirmation for the normal children than 
for the retardates. This was especially 
true in the reversal shift phase where re- 
tardates performed alike regardless of ver- 
balization, while significant effects of ver- 
balization were still attained for the normal 
children. This suggestive finding is con-
sistent with views of O’Connor and Her-
melin (1959) and of Milgram and Furth
(1967) that retardates are especially defi- 
cient when attempting to utilize overt ver- 
bal responses to regulate their behavior. 
There are several implications for educa- 
tional practice. Recitation aloud in the 
classroom may be beneficial, depending 
on what is being recited and what is being 
learned. If children are attempting to dis- 
cover or identify the relevant cues in a 
complex stimulus, verbal responses referring 
to the relevant cues themselves will direct 
attention more immediately and exclu-
sively to the requisite solution. Not only is
the formal criterion of discrimination 
learning attained more rapidly, but the ac- 
tual comprehension of the requisite solu- 
tion is enhanced. If we assume that ease of 
reversibility of an acquired discrimination 
reflects superior conceptual or mediational 
efficacy, then the more rapid attainment of 
the reversal shift precisely by the children 
who had rapidly attained the original dis-
crimination suggests that the original
learning was of a flexible mediational
character. Recital of relevant cues does
not entirely eliminate the discovery aspect
of learning, but may reduce sharply the
generating of unclear and erroneous hy-
potheses. So advantageous is the directing
of attention to the relevant cues that the
children may apparently benefit from a
complex verbalization which contains both
facilitating and interfering verbal labels.
Despite the “noise” in the channel, essen-
tial information apparently gets across,
since a combination of relevant and irrele-
vant verbal cues yields superior perform-
ance to no verbalization at all.

One should not conclude, however, that
recital aloud is invariably helpful. In the
present situation each instance of recital is 
accompanied by immediate feedback as to 
the correctness of the response made and 
under these circumstances it probably 
would direct exclusive attention to the cri- 
terial and away from the noncriterial cues. 


ATTITUDE LEARNING IN CHILDREN


60 4th- and 5th-grade middle-class children were Ss in an experiment
which was designed to demonstrate the effects of classical condi-
tioning upon attitude behavior. Hypothesis 1 was supported by the
significant changes which occurred in free play behavior in the case
of each experimental isolate. Classmates approached experimental iso-
lates more frequently than they did control isolates. The control iso-
lates’ behavior and interaction rate fluctuated insignificantly. The
changed number of interactions remained at a consistent level after
treatment for 1 wk. Hypothesis 2 was less directly supported because
the sociogram data were less clearly changed. In one class, however,
the high-popular children did change significantly. There were no sig-
nificant changes in the low-popular halves of either class. Measurable
changes were found after treatment which used a classical condi-
tioning paradigm.


According to Doob (1947), an attitude 
is an implicit response which is both an- 
ticipatory and mediating in reference to 
patterns of overt responses. An attitude is 
evoked by a variety of stimulus patterns 
as a result of previous learning or of gradi- 
ents of generalization and discrimination 
which is itself cue- and drive-producing. 
Using this definition, the present study in- 
volves three assumptions. (a) Individuals 
may be understood as discriminable stimu- 
lus patterns to which responses may be 
learned. (b) The name of any known per-
son acts as part of the stimulus pattern
which will recall the whole stimulus pat- 
tern to the subject’s (S’s) memory. (c) In- 
dividuals in groups have acquired a kind 
of status value or stimulus value which 
makes them important or unimportant, 
friends or nonfriends of the other members 
of the group, varying with the others’ ex- 
periences with them. The unimportant or 
low value person—the conditioned stimulus 
(CS) in this experiment—is defined as one 
who receives few social responses and who 
is regarded by no one or by one person
only in his class as a friend (as meas-
ured by sociogram). He was called an
isolate or social isolate for
this study.

Two hypotheses were investigated in this
study. (a) It was anticipated that at-
titude changes as a result of treatment
would have an effect upon the actual be- 
havior of the previously uninterested group 


members regarding the isolate, i.e. the 
treatment will mediate the overt responses 
of the experimental Ss. (b) Attitude be- 
havior which may be assessed by sociogram 
testing will change after using classical 
conditioning techniques to attach stimulus 
value to the isolate. 

Staats and Staats (1958) attached
pleasant and unpleasant meaning to words.
They used these conditioned words to con-
trol the liking or disliking of new words
when they were paired with the previous
CS words. Like and dislike for the words
in both levels of conditioning were meas-
ured by a rating scale. They further used
national names (e.g., Dutch) as well as
the names of people (e.g., Bill) as CSs.
Depending upon the pleasantness of the
unconditioned stimulus (UCS), the names
were rated on a continuum from pleasant
to unpleasant.

Conditioned behavioral changes in social 
approach behavior had not been noted be-
fore the current work. However, pilot stud-
ies by Early and Mercer (1961) demon-
strated that more children stated that they
liked a social isolate or neutral stimulus
after the name of that person had been
paired with a positive evaluative meaning
word (e.g., fun).

The current study was designed to note 
changes in sociogram ratings as well as be-
havioral changes after conditioning treat-
ment. The individual treatment of each
student was undertaken in order to deter- 
mine the extent of his memory of stimulus 


pairs.


METHOD


Two classes of children in the fourth and fifth 
grades of the University of California laboratory 
summer elementary school were used as Ss. They 
had attended class for 3½ weeks at the onset
of the study. The sample was a middle-class group 
of children which was accumulated from districts 
in the Berkeley City Area. Of the 60 children, 
95% were from families whose support was derived 
from fathers who were professional people. The 
teachers, observers, and first sociogram indicated 
six children who were alone most of the time
during recess or free play periods. Preferred 
acquaintances had already been made. Groups of 
the more popular children were established in play 
activities. 

In order to assess the group structure which 
existed in this temporary school setting, each 
teacher gave the following instructions for the 
first sociogram to his class. 


Please take a fresh sheet of paper and put 
your name on the upper right-hand corner. 
Think, without looking around, of some of the 
people you like in this classroom. Please write 
their names (first name and last initial if you
know it) on the paper, one under the next. List 
as many as you like, but fewer than 10 or 11. 
We will probably be using this information for 
possible seating arrangements in the future. 


A second sociogram similar to the first was ad- 
ministered after experimental treatment. This one 
was also allegedly for the teacher's use. 

Using the first sociogram, each child was ranked 
according to the number of times he was men-
tioned by his classmates. The boys and girls in
each class who received the fewest number of 
votes were designated social isolates. None of the 
isolates initially received more than one vote each. 
The most and least popular boys and girls in 
each class were divided randomly into experi-
mental (E) and control (C) groups. This proce-
dure developed eight groups: (a) 9 high-popular
male experimentals; (b) 8 high-popular male con-
trols; (c) 8 low-popular male experimentals; (d)
7 low-popular male controls; (e) 7 high-popular
female experimentals; (f) 6 high-popular female
controls; (g) 6 low-popular female experimentals;
(h) 7 low-popular female controls. The total
sample was 60 (30 in each class). The different 
numbers in each cell were a result of different 
numbers of boys and girls per class. 
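
The vote-counting step can be illustrated with a short sketch. The roster, the nominations, and the function names below are hypothetical; the threshold of one vote follows the statement that no isolate initially received more than one vote.

```python
from collections import Counter


def count_votes(nominations, roster):
    """Count how often each child on the roster is named by classmates.

    `nominations` maps each chooser to the list of names he or she wrote down.
    Self-nominations and repeated names are ignored (an assumption).
    """
    votes = Counter({name: 0 for name in roster})
    for chooser, chosen in nominations.items():
        for name in set(chosen):
            if name != chooser and name in votes:
                votes[name] += 1
    return votes


def find_isolates(votes, max_votes=1):
    """Return children with no more than `max_votes` mentions, alphabetically."""
    return sorted(name for name, n in votes.items() if n <= max_votes)


# Hypothetical four-child class, for illustration only.
roster = ["Ann", "Bea", "Cal", "Dee"]
nominations = {
    "Ann": ["Bea", "Cal"],
    "Bea": ["Ann", "Cal"],
    "Cal": ["Ann", "Bea"],
    "Dee": ["Ann"],
}
votes = count_votes(nominations, roster)
isolates = find_isolates(votes)
```

In this invented class, Dee is named by no one and is therefore the only child classed as an isolate.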

Observers were used to note the behavior
of each isolate for ½ hour on each of 4 days: before
experimental treatment, 1 day after treatment, 2 
days after treatment, and 1 week after treatment.
The number of interactions between the isolate 
and his school peers was counted. An interaction 
was counted for the isolate if he responded to or 
received a response from another child, that is,
hugging, smiling, talking, or wrestling. Two C chil- 
dren (one boy and one girl having nearly the 
same number of votes and interactions as the 
isolates from their class) were observed in order 
to determine the effects of the mere passage of 
time. Without observing such C children, one 
could not be certain whether any of the isolates 
would “warm up” to a group of potential associ- 
ates with repeated contact only. 

The Ss were given a list of words to rate on 
a 3-point scale: “I like,” “I don’t like,” or “I
don’t care.” Lists of high-positive evaluative mean-
ing words and neutral words were developed from
the children’s responses to the rating scales. The 
most frequently liked words were used as UCSs. 
The words most frequently chosen by the girls as 
“I like” were the following: CONSIDERATE, GOOD,
FUNNY, FRIENDLY, HAPPY, INTERESTING, PLAYFUL, 
SKILLFUL, KIND, POLITE, NEAT, NICE, GENEROUS, and 
CHEERFUL. The positive evaluative meaning words 
chosen by the boys were the same with the excep-
tion of POLITE and the addition of ACTIVE and FUN.
The children were indifferent to the following
words: AND, IF, OR, A, AN, FOR, OF, TABLE, and CHAIR.

In order to prepare the children for experi- 
mental treatment the experimenter explained to 
the classes that she was conducting a study on 
memory. Each child would have a chance to 
read a list of paired words which would seem 
strange, then the children would be rated on 
their ability to recall as many pairs of words as 
possible. In general, the children were coopera- 
tive and industrious. 

Each child was shown 32 cards after being 
taken individually into a room away from the 
class. Ten of the 32 cards in both E and C groups 
contained the name of the isolate. The other 22 
cards contained names of class members paired 
with a low-valued word. One series of 32 cards 
read and recited (as a memory task) constituted 
one trial. The children were given individual, 
consecutive trials until a total of one-half of the 
pairs were learned and 70% of the conditioning 
pairs were memorized. In this way control over 
the extent of conditioning was established. 
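
The stopping rule for the memory trials can be written as a small check. This sketch reads "one-half" and "70%" as minimums, which is an assumption, and the function and parameter names are hypothetical. Integer arithmetic is used so that the 70% boundary is exact.

```python
def memory_criterion_met(recalled_total, recalled_conditioning,
                         n_cards=32, n_conditioning=10):
    """True when at least half of all pairs and at least 70% of the
    conditioning pairs (those containing the isolate's name) are recalled."""
    half_learned = 2 * recalled_total >= n_cards
    conditioning_learned = 10 * recalled_conditioning >= 7 * n_conditioning
    return half_learned and conditioning_learned
```

With 32 cards of which 10 carry the isolate's name, a child would stop once 16 pairs in all, including 7 of the isolate pairs, had been recalled.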

Experimental group. The Ss were shown the
stimulus cards with the name of the isolate paired
10 times with one of the 10 positive meaning
words. The boys were shown only boys’ names,
girls only girls’ names. An example of the
stimulus pairs containing a CS and a UCS fol-
lows:


Karen (isolate’s name or CS)
NEAT (positive evaluative meaning word or UCS)


Every other name (of the same-sexed class group) 
appeared in the series paired with a nonevaluative
word as in the following example. 


Mary (nonisolate, classmate) 
AND (nonevaluative word) 


Control group. This randomly selected half of 
each class read the same number of words. How- 
ever, they read the isolate’s name paired with a 
nonevaluative word 10 times in the series. The 
control was used to note any changes in sociogram 
behavior which were not a result of experimental 
treatment. 


Karen (CS)
AND (nonevaluative meaning word or neutral stimulus)


Two forms of control were used, one for socio- 
gram data and one for observational data. Reli- 
ability of social isolation could be established 
in a case where no treatment was given. If untreated
isolates remained at the same level of
interaction while the treated isolates changed, it 
was reasonable to conclude that experimental 
treatment was responsible for such observed differ- 
ences. 


ANALYSIS AND RESULTS 


Two statistical tests were used in the
data analysis: (a) the chi-square one-sample
test for significance of change and (b)
the Fisher Exact Test. The former test was
used to examine the significance of change
in the data obtained from behavioral observations
of interactions made while the
children were playing. The Fisher Exact
Test was used to ascertain the significance
of change between pretreatment and posttreatment
sociograms. Results were considered
significant if obtained p values
were better than or equal to .05. The sociogram
data were analyzed by class, sex,
popularity, and E and C groups (Siegel,
1956).


TABLE 1

NUMBER OF INTERACTIONS OBSERVED FOR
EXPERIMENTAL AND CONTROL ISOLATES

                                    After treatment
Isolates       Pretreatment     Day 1      Day 2      1 Week
Experimental
  Female 1          9        [illegible] [illegible] [illegible]
  Female 2         11        [illegible] [illegible] [illegible]
  Subtotal         20            51         60          52
  Male 1            8        [illegible] [illegible] [illegible]
  Male 2            5        [illegible] [illegible] [illegible]
  Subtotal         13            54         50          54
Control
  Female           10            10         12       [illegible]
  Male              1             1          1       [illegible]
  Subtotal         11            11         13           7


TABLE 2

INCREMENTS IN NUMBER OF VOTES FOR THE
ISOLATES OF THE COMBINED HIGH-POPULAR
MEMBERS OF THE FOURTH AND FIFTH GRADES

Isolates        No change    Change in    Total Ss
                 in vote       vote
Control            12            3           15
Experimental        8            7           15
Total Ss           21            9           30

The increment in number of observed
social interactions after the initial observations
was significant at the p = .01
significance level. For all treated isolates
this increment was two to three times the 
initial rates. Table 1 contains raw data de- 
rived from observations of both experimen- 
tal and control isolates. The increment of 
interactions in the two control cases was 
not significant. The first hypothesis which 
stated that there are behavioral changes 
following classical conditioning procedures 
was confirmed. 
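As a rough illustration of a chi-square one-sample test, the sketch below tests whether interaction counts are spread evenly over the four observation periods. The article does not say exactly how the test was applied, so the equal-expected-frequency setup is an assumption. The counts are the experimental girls' subtotal row of Table 1 (20, 51, 60, 52), read here as pretreatment, Day 1, Day 2, and 1 week. The critical value 11.345 is the standard tabled chi-square value for df = 3 at the .01 level.

```python
def chi_square_equal_expected(observed):
    """Chi-square one-sample statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Tabled critical value of chi-square for df = 3 at the .01 level.
CRITICAL_01_DF3 = 11.345

# Subtotal row for the experimental girls in Table 1 (assumed column order:
# pretreatment, Day 1, Day 2, 1 week).
female_subtotals = [20, 51, 60, 52]

stat = chi_square_equal_expected(female_subtotals)
print(round(stat, 3), stat > CRITICAL_01_DF3)
```

Under this reading the statistic exceeds the .01 critical value, which agrees with the significance level reported in the text, but the sketch should not be taken as the authors' exact computation.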

The remaining data, regarding Hypothesis
2, are presented for the combined
classes and Classes 1 and 2, which correspond
to fourth and fifth grades. The number
of votes by the high-popular children
for the isolates was compared on Sociograms
1 and 2. The Fisher Exact Test was
used to analyze these data. The second prediction,
that classical conditioning would
change the children’s statements regarding
whom they liked, was not confirmed.

Table 2 combines the two classes for the
results of the votes from the high-popular
groups. Changes in the low-popular half of
the class were so low that no analysis was
done. The changes for the combined boys’
and girls’ E groups for the high-popular
half of the class were not significant.

Tables 3 and 4 contain the data for
the fourth grade class. Table 3 shows the
changes in the combined girls’ and boys’,
high- and low-popular, E and C groups.
TABLE 3

INCREMENT OF VOTERS FOR ISOLATE AFTER
TREATMENT, FOR FOURTH GRADE

                      Boys        Girls
Isolates             E    C      E    C     Total Ss
High Popularity      2    0      3    0        15
Low Popularity       0    1      0    0        15
Total Ss             8    8      7    7        30


These findings were not significant. Table 
4 shows the changes for the high-popular 
E and C groups from Class 1. The increase
in the number of votes from the first to 
the second sociogram was significant. The 
effects of treatment are clear here. 
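The Fisher Exact Test on a 2 × 2 table of this kind can be sketched with the standard library alone. The counts below are those printed in Table 4 (Control: 8 no change, 0 change; Experimental: 2 no change, 5 change) and in the Control and Experimental rows of Table 2 (12 and 3; 8 and 7). The two-sided form used here is an assumption, since the article does not say whether a one- or two-tailed test was run, and the function name is illustrative.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: Control, Experimental. Columns: no change in vote, change in vote.
p_table4 = fisher_exact_2x2(8, 0, 2, 5)    # fourth grade, high-popular
p_table2 = fisher_exact_2x2(12, 3, 8, 7)   # combined classes, high-popular
print(round(p_table4, 4), round(p_table2, 4))
```

With these counts the fourth-grade table falls below .05 and the combined table does not, which is the pattern the text reports.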

The second class contained 6 fourth- 
graders and 24 fifth-graders. None of these 
results was significant. The data in Table 
5 illustrate that there were as many voters 
in the control group for the isolate as there 
were in the experimental group among the 
more popular children. 

Conditioning of a few children appears 
to have been effective in both classes as 
measured by their vote changes for the iso- 
late on the sociograms. However, the num- 
ber of the children who were conditioned, 
as so measured, was significant in only one 
part of one class. Increments in the con- 
trol group explained the insignificant re- 
sults in the combined classes. In no case 
were the results of conditioning treatment 
in the low-popular group of Ss significant. 
Only 3 children of 60 voted for the iso- 
late in this group, one of whom was in the 
control series. 

The first hypothesis was strongly supported.
The children did begin responding
to the same-sexed isolate in their class
after conditioning treatment. The second
hypothesis was clearly supported by the
data from high-popular children in one
class only. Sociogram behavior was not
changed significantly as was actual play
behavior.

There were no significant differences in
boys’ increments between Sociograms 1
and 2. The E girls’ changes did not achieve
significance over the C girls’ group in
either combined classes or separate
classes.


TABLE 4

VOTING INCREMENTS IN THE HIGH-POPULAR
EXPERIMENTAL AND CONTROL GROUPS FOR
FOURTH GRADE

Isolates        No change    Change in    Total Ss
                 in vote       vote
Control             8            0            8
Experimental        2            5            7
Total Ss           10            5           15


TABLE 5

INCREMENT OF VOTERS FOR ISOLATE AFTER
TREATMENT, FOURTH AND FIFTH GRADES
COMBINED

                         Boys             Girls
Isolates              E         C        E    C     Total Ss
High Popularity       1         0        2    3        15
Low Popularity   [illegible]    1        0    0        15
Total Ss              9         9        6    6        30

Discussion 


Initial observations of the isolated chil- 
dren revealed that they tended to be unre- 
sponsive to others. They did very little 
during free play at recess and only infre- 
quently spoke to other children. If they 
spoke at all it was to someone smaller or 
younger than themselves. During the pe- 
riod of treatment and observations there 
was a noticeable change in the quality of 
their behavior. All four of the isolates be- 
came more animated, participated in games 
with other children when they had not 
done so previously, and chose children from 
their own class as playmates. In one case, 
immediately after treatment, a boy isolate 
left the younger children’s game and 
joined his own class for the first time, 
where he stayed during the remainder of 
the observation periods. Conditioning 
treatment of his peers produced responses 
to his social overtures so that they noticed
him and began to offer themselves as
social reinforcers. Because the isolate’s 
peers reinforced his approach behavior, it 
was continued when extinction might have 
occurred without reinforcement. 

Qualitative data seem to explain the
findings from the second class pertaining to
Hypothesis 2. In the process of arranging a
random, stratified sample, the peer groups
of the high-popular children were broken
into experimental and control groups. This
artificial division did not reduce the group
preferences which were established before
the experiment. Thus, a treated child approached
the isolate accompanied by his
friends who were in the C group. Probably
because a few of the experimental popular
children associated with the social isolate,
the control children learned to value the
isolate and therefore voted for him as a
person whom they liked. Nonexperimental
conditioning seemed to have occurred. The
popularity of one child was associated with
the isolate, thus probably making him acceptable
to other high-popular children.

The method of selecting the controls, 
then, made the experimental effects clear 
as well as the effects of indirect condi- 
tioning. In the second class, the experi- 
menter conditioned three children, and the 
treated children conditioned as many. The 
results from the second class were not sig- 
nificant at p better than or equal to .05; 
however, the first class’ data supported the 
second hypothesis. 

The low-popular children included un- 
treated isolates who were quiet, inactive, 
and unresponsive. In general they reported 
fewer children whom they liked. Many of 
these Ss actually avoided interaction. They 
required more trials to the learning criterion.
Further work in this area might
demonstrate some personality differences in
resistance, passivity, defensiveness, or introversion.
On the other hand, the high-
and medium-popular children seemed to be
more willing to participate in class discussions,
games, and memory tasks. They
were more socially active, hence more receptive
and responsive to stimuli outside of
themselves.

A previous, unpublished study by Early 
and Mercer (see Footnote 1) demonstrated 
that high-popular children, particularly 
girls, would be more likely to state that 
they liked the isolate after conditioning 
treatment than would high-popular boys.
The present study also found that the 
raw score changes for girls were greater 
than for the boys despite the lower number 
of girls. It seemed that girls were more 
conditionable or suggestible than the boys 
because of their greater changes after 
treatment. This tendency was not signifi- 
cant, however. 

Further work in the area might include
assessing personality differences between
high- and low-popular children. IQ differentials
may account for the decreased
ability to memorize lists of pairs. If the
less bright fall into the low-popular group,
they, therefore, may seem less attractive
to the brighter, more verbal, popular children.
Other personality reasons for a
child’s tendency to remain socially isolated
might include shyness, fear, or disinterest
in the world outside of themselves.
The last case implies that the
genuinely introverted child has chosen to
remain unnoticed.
ACHIEVEMENT MOTIVATION AND THE RECALL OF
INCOMPLETED AND COMPLETED EXAM QUESTIONS¹


BERNARD WEINER, PATRICK B. JOHNSON, AND ALBERT MEHRABIAN
University of California, Los Angeles


Male students indicated their level of aspiration on a final exam,
and subsequently were asked to recall the exam items. 2 measures of
resultant achievement motivation, 1 objective and 1 in part projective,
were employed to classify Ss into motive groups. For both measures
Ss high in resultant achievement motivation recalled a greater
percentage of failed than passed questions (the Zeigarnik effect). In
addition, they exhibited a greater Zeigarnik effect than Ss low in
resultant achievement motivation. The differential recall was due to
greater remembrance of the failed items by the high achievement-oriented
Ss. It is hypothesized that these students covertly rehearse
and think about the missed questions more than students
low in achievement motivation. Therefore, it is contended that
the Zeigarnik effect is a learning rather than a memory phenomenon.
Only the projective measure revealed group differences in level of
aspiration.


The interrupted task paradigm was intro- 
duced into psychology by Zeigarnik in 1927. 
Individuals receive a number of tasks to 
complete, and are interrupted before finish- 
ing some of them. Following this activity, 
they unexpectedly are asked to recall the 
tasks. Zeigarnik found greater recall of the
incompleted (I) than completed (C) tasks 
(the Zeigarnik effect). The differential re- 
call was believed to support Lewin’s (1935) 
conception of enduring tension systems. 

There was a partial reversal of Zeigar- 
nik’s results when studies of task recall 
were conducted in America. Many investigators
(e.g., Glixman, 1949; Rosenzweig,
1943) found greater recall of the C than I 
tasks in “ego-involved” situations. It was 
reasoned that it is “threatening” to re- 
member failure (I) experiences; the mate- 
rial associated with failure therefore is “re- 
pressed.” 

Atkinson (1953) in part resolved the apparent
contradiction between the findings
of Zeigarnik and Rosenzweig. He demonstrated
that individuals classified as high
in need for achievement (n Ach) remember
more I than C tasks in achievement-oriented
contexts. Conversely, subjects (Ss)
who are low in n Ach and considered relatively
anxious about failure remember more
C than I tasks. Analysis of Ss used by
Rosenzweig indicated that they were receiving
services from the psychological
clinic, and presumably would be characterized
as relatively anxious. This was not true
of the population used by Zeigarnik. Atkinson
argues that the different S populations
were responsible for the contradictory results
of Zeigarnik and Rosenzweig. Atkinson
also found that the differential Zeigarnik
effect was attributable to differences in
the recall of I, rather than C, tasks. He
reasons that the remembrance of I tasks is
instrumental to the attainment of achievement-related
goals; these goals are strived
for by individuals high in n Ach, but
avoided by individuals relatively low in
n Ach. (The reader is directed to Butterfield,
1964, and Weiner, 1966a, for a more
detailed review of experimentation in this
area.)

¹ This research was supported in whole by
Public Health Service Research Grant No. MH-[...]
from the National Institute of Health awarded
to the senior author. The authors wish to thank
Paul Feldman, Carol Price, and Penelope Potepan
for their aid in conducting the experiment.

In the present study Atkinson’s suppositions
are investigated in a real-life
achievement setting. Following a final examination
students were asked to recall the
exam items. These circumstances should
maximize aroused achievement motivation
and task involvement, and therefore magnify
previous findings (Atkinson, 1953;
Zeigarnik, 1927). Two measures of resultant
achievement motivation (n Ach minus
anxiety) were employed to classify Ss into
motive groups. One classification method
included a Thematic Apperception Test
(TAT) to measure n Ach, and a Test
Anxiety Questionnaire (TAQ) to assess
level of anxiety. Subtracting the z score on
the TAQ from the z score on the TAT
yields a measure of resultant achievement
motivation. This method of grouping individuals
is used most extensively in current
studies of achievement motivation (Atkinson,
1964). The second measure of resultant
achievement motivation was devised by
Mehrabian (in press), and is similar in
principle to a self-report measure constructed
by O’Connor (1962). The O’Connor
scale has been used with some success
(e.g., Weiner, 1966b) to classify Ss into
motive groups.


METHOD


The Ss were 205 students enrolled in the undergraduate
personality course at the University of
California, Los Angeles. On the second day of
class a TAT, picture series 2, 8, 4, 48 (Atkinson,
1958) was administered under neutral conditions
(McClelland, Atkinson, Clark, & Lowell, 1953).
All pictures were highly cued for achievement.
The story protocols were scored for n Ach by a
trained rater according to a reliable method of
content analysis (Atkinson, 1958). Evidence suggests
that the TAT measure of n Ach is not a
valid indicator of achievement strivings for all
females (Atkinson, 1958; French & Lesser, 1964).
Only the male sample (N = 82) was used in the
final data analysis. Following the administration of
the TAT, the TAQ was distributed (Mandler &
Sarason, 1952). This is a self-report measure of
situationally aroused anxiety. The items were
scored on a 5-point Likert Scale. As previously
indicated, z scores on the TAT and TAQ were
computed, and an index of resultant achievement
motivation derived by subtracting the z score on the
TAQ from the z score on the TAT. The Ss in
the top and bottom 25% of the distribution were
respectively classified as high or low in resultant
achievement motivation, while the remaining 50%
of the sample comprised the middle group.
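The index construction described above (z score on the TAT minus z score on the TAQ, then a split into top 25%, middle 50%, and bottom 25%) can be sketched as follows. The scores are invented for illustration. The use of the population standard deviation and the handling of ties at the cut points are assumptions, since the article does not specify either, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores (population standard deviation assumed)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def resultant_motivation(tat, taq):
    """z(TAT) minus z(TAQ) for each S, as described in the text."""
    return [zt - zq for zt, zq in zip(z_scores(tat), z_scores(taq))]


def classify(scores, fraction=0.25):
    """Label the top and bottom `fraction` of Ss; the rest are 'middle'."""
    n = len(scores)
    k = int(n * fraction)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = ["middle"] * n
    for i in order[:k]:
        labels[i] = "low"
    for i in order[n - k:]:
        labels[i] = "high"
    return labels


# Hypothetical raw scores for eight Ss (not data from the study).
tat = [2, 5, 7, 1, 9, 4, 6, 3]
taq = [30, 22, 15, 35, 10, 28, 20, 25]

labels = classify(resultant_motivation(tat, taq))
print(labels)
```

Standardizing both tests first puts them on a common scale, so that neither instrument dominates the difference simply because its raw scores have a wider range.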

The final test administered was a measure of
resultant achievement motivation devised by
Mehrabian (in press). The test, labeled the Resultant
Achievement Motivation Scale (RAM),
includes 34 items primarily derived from a theory
of achievement motivation formulated by Atkinson
in 1957 and from data supporting that conception.
Atkinson and other investigators have
demonstrated that individuals high in resultant
achievement motivation engage in achievement-related
activities, anticipate success, and prefer
tasks of intermediate difficulty. The test items
tap the direction of behavior, approach or avoidance,
exhibited in achievement contexts; the kind
of affect, hope or fear, associated with achievement
situations; and the degree of risk, intermediate
versus easy or difficult odds, preferred. Sample
items are:

² The protocols were scored by Patrick Johnson.
Interrater reliability with the senior author
had been established to be [illegible].


1. In my spare time I would rather learn a game 
to develop skill than for recreation. 

2. I worry more about getting a good grade than 
I worry about getting a bad grade. 

3. I would prefer a job which is important, diffi- 
cult, and involves a 50% chance of failure, 
to a job which is somewhat important but not 
difficult. 


Items were rated +3 (very strong agreement)
to -3 (very strong disagreement). The 10-week
test-retest reliability of the measure is r = .78;
the item-total correlations range between .2 and .5.
Again Ss high or low in resultant achievement
motivation comprised the top and bottom 25%
of the distribution.

Ten weeks after the individual difference test 
administrations, the students were given a 58 
question final examination. At the top of the test 
the following question was written: 


I am trying to get ____ correct out of 58.


The responses to this question provide an index 
of level of aspiration on the exam (Lewin, Dembo, 
Festinger, & Sears, 1943). The test format was 
“fill in the blanks.” When the students handed in
their exams, they were given a paper with the 
following written instructions: 


Please start at the top of the space below and 
write, as they occur to you, the items on the 
exam. Do not worry about spelling or how exact 
your memory of each question is. You do not 
have to write the whole question, but just 
enough for us to be able to identify it. The 
questions need not be written in the order in
which they appeared on the exam. Look at the
clock and take three minutes to do this. At the
end of that time turn in the paper. 


To provide an objective index of task completion,
test items missed were considered incomplete
or failed, while correct items were considered
complete or successful. Some failed items [...]


³ Further details of test construction, reliability,
and validity will be presented in a forthcoming
paper by Mehrabian.
RESULTS


The correlation between the two indexes 
of resultant achievement motivation was 
positive and significant, but relatively low, 
r = .30, p < .01. The mean number of
correct answers on the exam was 46. There- 
fore, there were many more C than I items. 
Exam performance was virtually identical 
for the three motive groups when classified 
with either resultant motivation index (F < 
TABLE 1

RECALL OF INCORRECT MINUS CORRECT ITEMS AND
LEVEL OF ASPIRATION FOR GROUPS DIFFERING
IN STRENGTH OF RESULTANT ACHIEVEMENT
MOTIVATION

[Table body not recoverable. Legible headings: motive measure; TAT-TAQ; N; percentage of Ss with percentage of I > percentage of C recall; percentage of I minus percentage of C; aspiration level.]


FIG. 1. Percentage recall of correct and incorrect
items when Ss are classified according to
strength of resultant achievement motivation
(TAT-TAQ). [Plot not recoverable. Axes: percentage recall by motive classification; curves: incomplete, complete.]


Table 1 gives the percentage of I minus percentage
of C recall and percentage of Ss
exhibiting a Zeigarnik effect for the three
motive groups. The table indicates that the
likelihood of a Zeigarnik effect is monotonically
related to the strength of resultant
achievement motivation. This occurs when
either the TAT-TAQ or RAM is used as
the motive measure. Among Ss high on the
TAT-TAQ index, 15 out of 20 (p < .05)
exhibit a Zeigarnik effect, while of the 21
Ss high on the RAM, 17 recall a greater
percentage of I than C items (p < .01).
The difference between the proportion of
Ss in the extreme groups recalling a greater
percentage of I than of C questions approaches
statistical significance with the
TAT-TAQ index (z = 1.67, p < .10), and
is significant when the RAM is the motive
measure (z = 2.26, p < .05). The prediction
for the differential recall is best substantiated
when both indexes are employed
simultaneously to classify Ss into motive
groups. Of the 9 Ss high on both measures,
8 show a Zeigarnik effect; only 2 of the 6
Ss low on both resultant measures recall a
greater percentage of I than C test items
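The counts reported above lend themselves to two simple computations, sketched here for illustration only. The article does not name the tests behind its p and z values. This sketch assumes an exact two-sided binomial (sign) test against a chance rate of .5 for the within-group counts, and a pooled two-proportion z test, with an optional continuity correction, for the comparison between extreme groups. All function names are hypothetical.

```python
from math import comb, sqrt


def sign_test_p(successes, n):
    """Exact two-sided binomial p value against a chance rate of .5."""
    k = max(successes, n - successes)
    tail = sum(comb(n, x) for x in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def two_proportion_z(x1, n1, x2, n2, continuity=False):
    """Pooled z statistic for the difference between two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    diff = abs(x1 / n1 - x2 / n2)
    if continuity:
        # Continuity correction: shrink the difference toward zero.
        diff = max(0.0, diff - 0.5 * (1 / n1 + 1 / n2))
    return diff / se


# Ss showing a Zeigarnik effect among those high on each index.
p_tat_taq = sign_test_p(15, 20)
p_ram = sign_test_p(17, 21)

# Ss high on both measures (8 of 9) versus low on both (2 of 6).
z_plain = two_proportion_z(8, 9, 2, 6)
z_corrected = two_proportion_z(8, 9, 2, 6, continuity=True)

print(round(p_tat_taq, 4), round(p_ram, 4))
print(round(z_plain, 3), round(z_corrected, 3))
```

With cell counts as small as 9 and 6, the normal approximation is rough, which is one reason a continuity correction or an exact test is usually preferred for such comparisons.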


The figure reveals
clearly that the Zeigarnik effect shown in [...]
recall of I items. The Ss high in resultant
achievement motivation recall a greater [...]
(z = 1.82, p < .10). The pattern of results is [...]
classified according [...]
(z = 1.68, p < .10).


Level of Aspiration

Table 1 also shows the level of aspiration
for the three motive groups. When the
TAT-TAQ is employed to classify Ss, there [...]
in aspiration level between the extreme
groups, t < 1.


Discussion 


The data indicated that assessment in- 
struments which supposedly measure the 
same personality dimension only correlate 
moderately with one another. However, the 
measures were equally valid in their pre- 
diction of task recall. It appears that the 
objective resultant measure complemented, 
rather than supplemented, the partly pro- 
jective assessment technique. Yet only the 
TAT-TAQ index significantly differentiated 
the motive groups in terms of aspiration
level. Level of aspiration in the present 
context denoted an imaginative achieve- 
ment goal. Many students indicated that 
their goal was to answer all the questions 
correctly. The responses were similar to 
those of Ss asked what they are “hoping” 
for, as opposed to what level they actually 
are attempting to reach (Lewin et al., 
1943). Recently, Wallace (1966) has argued 
that: “the closer the approximation of the 
role-playing situation to the predictive 
situation, the greater...the accuracy of 
the prediction [p. 136].” In the present 
experiment the partial fantasy measure 
predicted fantasy behavior better than the 
objective index. It is conceivable that the 
objective and fantasy indexes of resultant 
achievement motivation will, at times, 
successfully predict different achievement 
behaviors. In sum the data presented here 
do provide validity for the RAM as a mo- 
tive measure, and suggest that this is a 
promising instrument for the prediction of 
some achievement strivings. In addition, because
the RAM items were derived from a
theory of achievement motivation, the posi- 
tive results tend to validate both the theory 
and the measure (Cronbach & Meehl, 1955). 
The findings concerning task recall repli- 
cate results reported previously by Atkin- 
son (1953). The Zeigarnik effect is pre- 
dominantly manifested in Ss high in 
resultant achievement motivation, and is 
caused by the differential recall of I tasks.
But differential recall does not necessarily 
reflect memory disparities. Retention is con- 
ceptualized as a multistage process. The first 
stage is learning, or trace formation. The 
temporally subsequent stages are trace storage
and trace retrieval. Only the latter two
stages are adjudged to be memory processes.
To show differences in the memory of
events, there must be equality in the degree
of original learning of the material. If Ss
learn I tasks to a greater degree than C
tasks, then one also would expect differential
recall of those tasks. Caron and Wallach
(1957) have demonstrated that the Zeigar- 
nik effect is due to differential learning 
rather than to differential retention. In 
their study Ss were told that the I tasks 
were insoluble after the initial recall period 
was completed. Therefore, there was no 
persisting source of motivation for rela- 
tively anxious Ss to repress the tasks, nor 
any instrumental inducement for Ss striving 
for success to retain the material. Following 
this information about the insolubility of 
the I tasks the differential retention found 
at the end of the first recall period should 
dissipate. However, customary differences 
in the pattern of recall between the motive 
groups were observed after the feedback. 
Caron and Wallach therefore concluded 
that the recall differences must be attrib- 
uted to differential learning, rather than 
to differential memory. 

The results of the study by Caron and
Wallach, combined with the present data,
suggest that Ss high in resultant achievement
motivation learn I tasks to a greater
extent than Ss low in resultant achievement
motivation. Learning is in part a function
of the number of repetitions of the stimulus.
It is hypothesized that Ss high in resultant
achievement motivation covertly repeat
and rehearse questions which they have
missed more than low-achievement individuals.
Weiner (1965) has summarized a
number of studies which reveal that Ss
high in achievement motivation are attracted
toward tasks which they have
initially failed. Conversely, Ss low in achievement
motivation especially avoid tasks
which they have not been able to complete.
Because students low in resultant achievement
motivation avoid failed or incomplete
test items, they are less likely covertly to
repeat and remember those items than the
high resultant achievement motivation students.
This analysis implies that if students
are not allowed to return to the failed tasks,
then the differential Zeigarnik effect would
not be exhibited. The differential persistence
at failed tasks also might be responsible for
the disparate grade point averages that are,
at times, manifested by the groups (see
McClelland et al., 1953).
the single unifying structure, the sentence, performed significantly
better than did Ss who learned under nonunified conditions. These
results contradict those of an earlier experiment in which it was
concluded that verbal organization does not facilitate performance
in a serial task.


Recently, considerable attention has 
focused on the question whether or not 
identical processes are involved in learning 
the serial order of items in a list and in 
learning the pair-wise arrangement of 
items in a list. The most frequently used 
method in attempts to answer this ques- 
tion has been that of transfer designs, 
either serial (Ser) to paired associate (PA) 
or PA to Ser. Although evidence thus far 
produced by the application of this method 
is not yet entirely conclusive, some in- 
vestigators have construed available re- 
sults as implying that the processes of Ser 
and PA learning do indeed differ. Jensen 
(1962), for example, has contended that the 
learning of a Ser list consists of the inte- 
gration of a sequence of responses into a 
single unit rather than of the acquisition 
of connections between successive eliciting 
stimuli and their companion responses. 
Adopting this contention (Jensen & Roh- 
wer, 1965a), Jensen and Rohwer (1965b) 
have gone on to characterize one of the 
differences between Ser and PA learning 
in terms of the relative importance of past 
verbal experience for the two kinds of 
learning: 


In short, we hypothesize that PA learning ability
reflects relatively more the richness of S's past
verbal experience and its spontaneous availability
in a learning situation, while serial learning constitutes
a more fundamental kind of ability which
is relatively unaffected by the amount of previous
verbal experience [Jensen & Rohwer, 1965b, p.
602].


¹ This work was supported, in part, by a contract
with the United States Office of Education
(OE6-10-273) through Project Literacy. The report
was prepared at the Institute of Human Learning,
which is supported by grants from the National
Science Foundation and the National Institutes of
Health.


The purpose of the present experiment
is to disentangle the hypothesis of response
integration as a description of Ser learning
and the assertion that the availability
of previous verbal experience is irrelevant
to the efficiency of Ser learning. The
validity of the latter hypothesis depends,
in part, on the results of a study (Jensen &
Rohwer, 1965b) conducted to test one of
its implications, namely, that verbal organization
of PA items should facilitate
acquisition whereas verbal organization
of Ser items should not. The
tasks of learning a 10-pair PA list and a
10-item Ser list were given to children of a
variety of grade levels. In the case of both
tasks, the treatment and control conditions
were distinguished only by the procedure
followed on the initial study trial.
For the latter, the PAs and the Ser items
were simply shown successively and S was
asked to name the object in each picture
as it was shown to him. In the treatment
condition for the PA task, S was asked
to elaborate the names of the two objects
in each pair into a sentence, one sentence
per pair. Similarly, the treatment
condition for the Ser task required S to
elaborate the names of each successive
pair of objects into a sentence, two items
per sentence. Note that this procedure is
consistent with the conception that Ser
learning consists of the acquisition of connections
between successive items each of
which serves both as a stimulus and as a
response.


VERBAL ORGANIZATION AND THE FACILITATION OF SERIAL LEARNING

The results for elementary school children
replicated those previously obtained
for mentally retarded adults (Jensen &
Rohwer, 1963): The sentence condition
produced substantial facilitation of PA
but not of Ser learning. Accordingly, the
investigators concluded in favor of their
original hypothesis regarding the irrelevance
of verbal organization for Ser learning
and went on to say:

If a true difference between the sentence and 
naming conditions were found to exist, we would 
be inclined to interpret the difference as being at- 
tributable to facilitation of response learning 
rather than to facilitation of serial learning per se 
[p. 606]. 

In contrast to these conclusions, the
guiding hypothesis for the present study is
that verbal organization is relevant to
Ser learning but only if the type of organization
imposed is consistent with what is
ordinarily learned when a Ser list is acquired.
In agreement with Jensen (1962)
and with Jensen and Rohwer (1965a) it is
assumed that the learning of a Ser list consists
of the process of integrating the items
into a single response. On this assumption,
the absence of facilitation previously reported
(Jensen & Rohwer, 1963, 1965b)
would be expected; the kind of verbal organization
used, that is, successive discrete
sentences, is not consistent with what is
presumably acquired in Ser learning. A
different type of verbal organization, specifically,
one that confers on all items in
the Ser list membership in a single unit,
would be expected to produce facilitation
not accountable in terms of enhanced response
learning. The present experiment
was designed to test this prediction.


Method


Design and Materials 


All Ss were given a common task, namely, to learn the serial order of
one or the other of two lists of 14 familiar nouns. The design was a
6 X 2 X 2 X 4 factorial in which the factors were, respectively:
conditions, lists, grades, and trials. The various conditions differed
only with regard to the character of the one study trial during which
the list was first presented; thereafter all were identical.

The six conditions may be viewed as an aggregation of an experimental
and five control groups. The study-trial materials for the experimental
or single sentence (SS) condition were constructed to conform with the
requirement that all of the items in the list be contained within the
same verbal unit. Each of the 14 nouns was presented in the context of a
three- or four-word phrase. The critical property of the phrases was
that when read in the prescribed order, they formed one continuous,
meaningful sentence. Thus the study-trial materials for the SS condition
were conceived to be a concrete expression of a verbal organization
consistent with the interpretation of serial learning as a process of
integrating a [...] consisted of the 14 nouns in the list presented
successively in accord with the traditional Ser procedure.

Since it was expected that learning would be [...] sentence.
Accordingly, in the phrase control (PC) condition, the study-trial
materials consisted of a set of 14 unrelated phrases, [...] nouns in the
list. The [...] added function of providing a comparison with SS in
which noun study time was equated, as was [...] particular words used in
the verbal contexts were [...] than those used in SS. By way of
obviating this difficulty, the same phrases used in SS were presented in
a scrambled order in the scrambled sentence control (SSC) condition such
that their succession did not form a [...] of items [...] evaluate
directly the effects of SSC on serial learning, the remaining conditions
in the design [...]

TABLE 1
LIST A MATERIALS

Single sentence (SS)        Phrase control (PC)       Noun control (NC)
the grey CAT                the grey CAT              CAT
jumped over the LOG         we jumped the LOG         LOG
and crossed the STREET      I crossed the STREET      STREET
to find the BOWL            you find the BOWL         BOWL
of cold MILK                some cold MILK            MILK
under the CHAIR             his own CHAIR             CHAIR
in the new HOUSE            our nice new HOUSE        HOUSE
by the blue LAKE            a little blue LAKE        LAKE
where the young BOY         my fine young BOY         BOY
lost his left SHOE          he lost his SHOE          SHOE
while eating the FISH       she's eating the FISH     FISH
on the wooden BOAT          an old wooden BOAT        BOAT
during the STORM            that awful STORM          STORM
that came last YEAR         they came last YEAR       YEAR

Scrambled sentence          Scrambled phrase          Scrambled noun
control (SSC)               control (SPC)             control (SNC)
that came last YEAR         they came last YEAR       YEAR
the grey CAT                the grey CAT              CAT
of cold MILK                some cold MILK            MILK
where the young BOY         my fine young BOY         BOY
and crossed the STREET      I crossed the STREET      STREET
on the wooden BOAT          an old wooden BOAT        BOAT
lost his left SHOE          he lost his SHOE          SHOE
to find the BOWL            you find the BOWL         BOWL
while eating the FISH       she's eating the FISH     FISH
jumped over the LOG         we jumped the LOG         LOG
in the new HOUSE            our nice new HOUSE        HOUSE
under the CHAIR             his own CHAIR             CHAIR
during the STORM            that awful STORM          STORM
by the blue LAKE            a little blue LAKE        LAKE


The three remaining factors were: lists, grades, 
and trials. Two distinct lists of nouns were used 
to reduce the risk that results would be specific 
to one set of items. Children were drawn from 
two grade levels rather than one only to provide 
a sample of adequate size, not to test hypotheses 
as to age differences. 


Procedure 


When S entered the room, the experimenter (E) told him that he was to
learn a list of nouns (or nouns in phrases) in the order in which they
were presented. The instructions described the procedures that would be
followed in the study trial and in the anticipation trials as well as
the type and timing of the responses expected.

All materials were presented on a memory drum. Immediately after the
instructions, the 14 successive nouns (or phrases) were shown at a
4-second rate and, as each one appeared, it was read aloud by E.
Following the study trial an asterisk appeared and S had 4 seconds to
supply the first noun. The first noun then appeared, and S had another
4 seconds to offer the second noun, and so on through the list to the
end of the first anticipation trial. Three more anticipation trials were
given for a total of four in all.
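
The anticipation procedure just described can be sketched in code. The following is only an illustrative sketch, not the authors' scoring routine: the function name `score_anticipation_trial` is hypothetical, and it assumes a response counts as correct only when it names the item at that serial position and is given within the 4-second interval (a missing response is represented by `None`). The nouns are those of List A in Table 1.

```python
# Illustrative sketch of scoring one serial anticipation trial.
# Assumption (not stated in the source): a response is correct only if it
# matches the noun at that serial position; None marks "no response in time".

LIST_A = ["cat", "log", "street", "bowl", "milk", "chair", "house",
          "lake", "boy", "shoe", "fish", "boat", "storm", "year"]


def score_anticipation_trial(items, responses):
    """Return the number of correct anticipations on a single trial.

    items     -- the nouns in their serial order
    responses -- responses[i] is what S offered while the cue for position i
                 was showing (the asterisk for i == 0, item i-1 otherwise)
    """
    if len(items) != len(responses):
        raise ValueError("one response slot is needed per list position")
    correct = 0
    for item, response in zip(items, responses):
        if response is not None and response.strip().lower() == item:
            correct += 1
    return correct


def total_correct(items, trials):
    """Sum correct responses over all anticipation trials (four in the study)."""
    return sum(score_anticipation_trial(items, responses) for responses in trials)
```

Summing the per-trial scores over the four anticipation trials gives a per-S total of the kind used as the dependent variable in the Results.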

It is important to note that in all conditions only the nouns themselves
were presented during the four anticipation trials. In other words, Ss
in the sentence and phrase conditions were given a verbal context only
on the initial presentation trial.


Subjects 


Ninety-six fourth- and fifth-grade children from a school serving a
middle class residential area participated in the experiment.¹
Forty-eight children from each grade were randomly assigned to the six
experimental conditions. All Ss were tested individually by E.
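
A balanced random assignment of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `assign_within_grade`, the integer child identifiers, and the seed are all hypothetical, and the sketch assumes equal group sizes within each grade (48 children over six conditions gives 8 per condition).

```python
import random

# Condition labels as abbreviated in the article's tables and text.
CONDITIONS = ["SS", "NC", "PC", "SSC", "SNC", "SPC"]


def assign_within_grade(child_ids, conditions=CONDITIONS, seed=None):
    """Randomly split the children of one grade into equal-sized condition groups.

    Assumption: equal cell sizes; the source states only that 48 children per
    grade were randomly assigned to the six conditions.
    """
    ids = list(child_ids)
    if len(ids) % len(conditions) != 0:
        raise ValueError("children cannot be divided evenly among conditions")
    rng = random.Random(seed)
    rng.shuffle(ids)
    per_group = len(ids) // len(conditions)
    return {
        condition: ids[i * per_group:(i + 1) * per_group]
        for i, condition in enumerate(conditions)
    }


# Hypothetical usage: identifiers 0-47 stand in for the 48 children of a grade.
fourth_grade = assign_within_grade(range(48), seed=1)
```

Running the same function once per grade yields 16 Ss per condition overall, which agrees with the 96 Ss and six conditions reported.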


Results


The dependent variable in this study was the number of correct responses
given by S over the four anticipation trials. A repeated measures
analysis of variance was performed on the data. The analysis of variance
table is presented in Table 2. All hypotheses were tested with the
probability of a Type I error equal to .05. It may be seen from Table 2
that there are three significant sources of variation, namely
conditions, trials, and grades X trials.

The trials effect was expected, and accounts for about 57% of the within
variance. The grades X trials effect may be traced to the slightly
superior learning rate of Ss in the fifth grade.
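
The arithmetic behind these statements can be checked from the mean squares and degrees of freedom in Table 2. The sketch below is only a consistency check: it assumes, as is conventional for this layout, that between-subjects effects are tested against the between-subjects error term and within-subjects effects against the within-subjects error term, and that a sum of squares equals df times MS. The variable and function names are illustrative.

```python
# Values copied from Table 2 (MS = mean square).
MS_ERROR_BETWEEN = 14.98   # error term, 72 df
MS_ERROR_WITHIN = 2.15     # error term, 216 df


def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


f_conditions = f_ratio(124.06, MS_ERROR_BETWEEN)    # conditions, 5 df
f_trials = f_ratio(275.91, MS_ERROR_WITHIN)         # trials, 3 df
f_grades_x_trials = f_ratio(6.61, MS_ERROR_WITHIN)  # grades X trials, 3 df

# Share of the within-subjects variation carried by trials: SS = df * MS.
ss_trials = 3 * 275.91
ss_within = 288 * 5.06
trials_share = ss_trials / ss_within
```

Under these assumptions the three ratios reproduce the tabled F values to two decimals, and the trials share rounds to the "about 57%" cited in the text.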

It is of particular interest that there is no main effect for either
grades or lists, and that none of the interactions involving these
factors in the between portion of the table is significant. The mean
number of correct responses per trial as a function of conditions and
lists is presented in Table 3.

Within the main effect of conditions Scheffé's method for post hoc
comparisons reveals that the SS group differs from each of the other
groups, and that no other pairwise contrasts are significant. The
comparison involving SS versus the average of all other conditions
combined accounts for 91.5% of the total between conditions sum of
squares, and the sum of squares for all other available orthogonal
comparisons is not significant. As an inspection of Table 3 indicates,
the results are clear; SS did facilitate learning relative to the
ordinary serial procedure condition, NC, and the magnitude of
facilitation was as great on Trial 4 as it was across trials. The
additional fact that NC produced as many correct responses as each of
the other control conditions contraindicates an interpretation of the
facilitative effect of SS in terms of an enhancement of response
learning.


TABLE 2
ANALYSIS OF VARIANCE TABLE

Source                   df        MS          F
Between subjects         95       20.30
  Grades                  1        6.51        —
  Conditions              5      124.06       8.28*
  Lists                   1       20.17       1.35
  G X C                   5        9.91        —
  G X L                   1        1.04        —
  C X L                   5       24.42       1.63
  G X C X L               5        6.13        —
  Error                  72       14.98
Within subjects         288        5.06
  Trials                  3      275.91     128.33*
  G X T                   3        6.61       3.07*
  C X T                  15        2.73       1.27
  L X T                   3        5.42       2.52
  G X C X T              15        1.57        —
  G X L X T               3        1.67        —
  C X L X T              15        1.30        —
  G X C X L X T          15        2.57       1.20
  Error                 216        2.15
Total                   383        8.84

*p < .05.


¹ The authors would like to express their appreciation to Elmer Venter
and his staff at John Muir Elementary School for their cooperation in
the execution of the present study.
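
The logic of the comparison of SS against the average of the other conditions can be illustrated numerically. The sketch below is an assumption-laden illustration rather than a reanalysis: it uses only the Trial 4 condition means from Table 3 (the 91.5% figure in the text refers to the analysis across all trials, so this calculation is not expected to reproduce it), it assumes equal group sizes so that the group size cancels out of the ratio, and the function names are hypothetical.

```python
# Trial 4 means from Table 3, in the order SS, NC, PC, SSC, SNC, SPC.
TRIAL4_MEANS = [10.12, 7.25, 6.00, 5.75, 6.50, 5.44]

# Contrast weights: SS versus the average of the five control conditions.
WEIGHTS = [5, -1, -1, -1, -1, -1]


def contrast_ss(means, weights):
    """Sum of squares for a contrast, per unit of group size (equal n assumed)."""
    psi = sum(w * m for w, m in zip(weights, means))
    return psi ** 2 / sum(w ** 2 for w in weights)


def between_ss(means):
    """Between-conditions sum of squares, per unit of group size (equal n assumed)."""
    grand_mean = sum(means) / len(means)
    return sum((m - grand_mean) ** 2 for m in means)


# Proportion of the between-conditions variation on Trial 4 carried by the contrast.
share = contrast_ss(TRIAL4_MEANS, WEIGHTS) / between_ss(TRIAL4_MEANS)
```

Because the weights sum to zero, the contrast sum of squares can never exceed the between-conditions sum of squares, so the share lies between 0 and 1; a value near 1 means that the single SS-versus-rest comparison carries most of the differences among conditions.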


TABLE 3
MEAN NUMBERS OF CORRECT RESPONSES ACROSS TRIALS AND ON TRIAL 4

                               Conditions
Lists        SS       NC       PC       SSC      SNC      SPC      All
A            7.62     5.06     4.19     5.25     5.62     3.91     5.28
B            [?]      [?]      [?]      2.84     4.03     4.58     4.82
All          [?]      [?]      [?]      4.05     4.83     4.23     [?]
Trial 4     10.12     7.25     6.00     5.75     6.50     5.44     6.84


Discussion


The results of the present study support 
the initiating hypothesis, namely that ver- 
bal organization is relevant to the integra- 
tion of a sequentially ordered set of re- 
sponses. Even though this conclusion is in 
direct opposition to that reached by Jen- 
sen and Rohwer (1965b) it is not incon- 
sistent with their interpretation of serial 
learning as a process of response integra- 
tion (Jensen & Rohwer, 1965a). Indeed, 
the present results may be construed as in- 
direct evidence in support of that inter- 
pretation since the form of verbal organi- 
zation employed follows from it. 

A fruitful theory of what is learned in serial learning ought to have
implications for the design of conditions to facilitate that process.
The adequacy of the theory, then, depends, in part, upon whether or not
the facilitative procedures that can be derived from it serve to
increase learning efficiency. Although the present results are
suggestive, they are not sufficient to permit a conclusive judgment in
this regard. Accordingly, it is of some import to conduct a comparative
experiment designed to assess the relative efficacy of facilitative
conditions derived from the principal theories of serial learning.

The problem of the effect of SS on re- 
sponse learning deserves brief additional 
comment. In the present design, no provi- 
sion was made for a direct assessment of 
the degree of response learning as a func- 
tion of study-trial conditions. Neverthe- 
less, it is difficult to discern in the SS 
phrases any properties relevant to the 
efficiency of response learning that are not 
also present in the PC phrases. Thus, our 
interpretation is that verbal organization 
of the appropriate type affects the process 
of serial learning directly. 

Two other problems for investigation are suggested by the present
results. The first concerns the effect of sentential organization on the
form of the serial position curve. That is to say, it is pertinent now
to examine in more detail the process of facilitating response
integration as reflected in the numbers of items learned per trial and
in the order in which they are learned. If the verbal context provided
is critically involved in this process, variations in sentence
properties such as phrase structure should affect the magnitude and
location of errors in learning. Through the application of a
phrase-structure analysis, Johnson (1965) has achieved a remarkable
degree of success in predicting the error frequencies in the learning of
sentences as responses in a PA task. A similar application might prove
fruitful in the case of serial learning.

Finally, since it has been demonstrated that the provision of a verbal
organization containing all of the items in a serial list facilitates
learning, it is of interest to determine the conditions under which
positive transfer would occur. One approach to this goal would involve
the manipulation of both training and instructional variables relevant
to the use and generation of verbal organization in the learning of
serial lists. The effectiveness of the manipulations could then be
evaluated in terms of performance on a transfer task administered in
accord with the usual method of serial anticipation.


The present study investigated the effects of multiple- vs.
single-concept training orthogonally crossed with fixed, blocked, and
scrambled [...] solution rules. Otherwise the manipulations of solution
rules did not differentially [...] training resulted in greater
proficiency than single-concept training [...] after the first anagram
within [...] transition to new sets of anagrams. However, [...] resulted
in greater proficiency once the transition was made.


One facet of transfer that facilitates the rapid solution of problems is
the acquisition of learning sets or “learning to learn” (LTL). Simply
defined, this means that when individuals learn they learn not only the
specific concept, skill, or discrimination demanded by the situation but
they also learn something about how to form a concept, to develop a
skill, to make precise discriminations, or to solve problems. A
technical definition of LTL suggests that the improvement of performance
is due to the similarity in the general processes fundamental to the
learning of several successive tasks which are all representatives of a
given class but which have no other systematic similarity among them
(Underwood, 1966, p. 510). Harlow (1949), in his early study on learning
sets with animal subjects (Ss), called attention to the importance of
the principle of practice for this aspect of human learning. Through
[...] practice trials on related problems the individual becomes
proficient in solving new problems within a given class due to the
transfer of learning to learn.

Because of its pervasive nature LTL is one of the important outcomes of
the [...] process. Unfortunately, it may be one of the most neglected
learning outcomes in compensatory education (Bloom, Davis, & Hess, 1965)
as well as in formal school settings despite the many innovations in
content made through [...] curriculum reform.

An analysis of the practice conditions for LTL suggests variables that
ultimately may prove to be of importance for teaching methodology (see
also Di Vesta & Walls, 1967a, 1967b, 1967c). As Adams (1954, p. 15)
indicates, Harlow’s method of training on a large number of problems may
be only one of several possible training techniques that might be used.
Thus, the successive tasks in initial training might be repeated
presentations of the same problem, or they might be presentations of
multiple problems. When working with concepts this implies a comparison
of solution times for problems in which the same concept is employed
with solution times for problems in which multiple concepts are
employed. Furthermore, practice on problems blocked according to concept
could lead to greater facility in the solution of new problems than
practice with a variety of concepts varied unsystematically. Finally,
the method of solution might be fixed for all problems; different for
each set of problems but the same for trials within problem; or
different for all trials within a given problem.

The present study investigated the effects of certain of these variables
on problem solving. In particular, this study evaluates the relative
effectiveness of single-concept and multiple-concept training; and of
fixed, blocked, and scrambled solution rules, during training, on time
required for solution of successive sets of anagrams comprising the
transfer task. The use of anagrams permitted adherence to the
characteristics of LTL, as described by Underwood (1966, p. 510), that
(a) successive sets of problems represent samples of items of the same
class of materials, and (b) no systematic similarity exists between the
successive sets of problems other than the commonality which allows them
to be said to be of the same class of materials.


Method


Design 


One part of the design consisted of two varia- 
tions in stimulus conditions operationally defined 
as the concept class. In the same-concept (SC) 
condition all anagrams in the training phase be- 
longed to the same concept class. In the multiple- 
concept (MC) condition, Ss were trained on sev- 
eral concept classes. These stimulus conditions were 
crossed orthogonally with three variations in 
response (responses were defined as solution rules) 
conditions. In the fixed-response (FR) treatment 
the solution was fixed, that is, the same solution 
rule could be used to unscramble any anagram in 
the entire training series. In the blocked-response 
(BR) condition the solutions were grouped so 
that the same solution rule could be used within 
a given problem but the solution rules differed 
among problems. The same solution rules used 
in the BR condition were also used in the mixed- 
response (MR) condition. However, in the MR 
condition the solution rules were assigned at 
random to the anagrams. 

Within these primary cells of the design there were five sets (blocks)
of anagram problems. Within each block there were seven anagrams
(trials). This organization of five blocks of problems with seven trials
in each block was employed in both the training and transfer series.
Thus, each of the six groups of Ss was required to solve a total of 70
anagrams.

Analyses were made of improvement in performance over blocks of trials
and of improvement in performance as a consequence of position within
blocks. For these purposes, the 2 X 3 X 5 X 7 design was analyzed by
Lindquist’s (1953) extended Type VI (two between and two within
dimensions) mixed analysis of variance, separately for the training and
for the transfer series.


Subjects 


A total of 90 Ss, 15 in each of the six conditions, participated in the
study. The Ss were undergraduate educational psychology students at
Pennsylvania State University. There were 25 males and 65 females in the
total group. The Ss were assigned randomly to the six conditions with
the restriction that these assignments be balanced over all cells. There
were no restrictions placed on the selection of Ss, many of whom had
participated in previous verbal learning experiments. However, none had
more than casual experience with anagram problems.


Materials 


The materials for the SC conditions consisted of 35 five-letter anagrams
of words from one concept class, that of articles of personal wear such
as gloves, scarf, watch, and smock. The materials for the MC conditions
were 35 five-letter anagrams of words from the five categories of
musical instruments (viola, cello, flute, etc.), flowers (lilac, phlox,
etc.), nationalities (Swede, Dutch, etc.), parts of the anatomy (ankle,
chest, etc.), and whiteness (paper, chalk, etc.).

All anagrams for the FR conditions could be solved by employing the
solution rule of 5,2,1,4,3. The solution rules for the BR and MR
conditions were 1,3,5,2,4; 4,3,2,1,5; 1,3,5,2,4; 3,5,2,4,1; and
3,4,5,1,2. These sets of solution rules were applied at random to the
blocks of words when constructing the anagrams. As noted earlier the
solution rules for the BR condition were varied so that one rule could
be used to solve all anagrams within a block but each block was solved
by a different rule. The five solution rules for the MR conditions were
assigned randomly throughout the series of anagrams.
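
How a solution rule relates a word to its anagram can be sketched in code. The source does not spell out how a rule such as 5,2,1,4,3 is to be read, so the following rests on a loudly labeled assumption: the digits give the positions of the anagram's letters in the order in which they must be read to spell the word. The function names `make_anagram` and `solve_anagram` are hypothetical.

```python
# ASSUMPTION: a rule like (5, 2, 1, 4, 3) lists anagram letter positions
# (1-based) in the order they are read to spell the solution word.
# The source does not state this convention explicitly.

FIXED_RULE = (5, 2, 1, 4, 3)  # the rule reported for the FR conditions


def make_anagram(word, rule):
    """Scramble `word` so that reading the result by `rule` restores it."""
    if sorted(rule) != list(range(1, len(word) + 1)):
        raise ValueError("rule must be a permutation of 1..len(word)")
    letters = [""] * len(word)
    for word_index, anagram_position in enumerate(rule):
        letters[anagram_position - 1] = word[word_index]
    return "".join(letters)


def solve_anagram(anagram, rule):
    """Read the anagram's letters in the order given by `rule`."""
    return "".join(anagram[position - 1] for position in rule)


# Hypothetical usage with a word from the MC materials.
scrambled = make_anagram("chalk", FIXED_RULE)
restored = solve_anagram(scrambled, FIXED_RULE)
```

Under this reading, a fixed rule lets S unscramble every anagram by the same sequence of moves, whereas blocked and mixed rules change that sequence between blocks or between individual anagrams.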

All groups solved the same transfer task. This task was blocked into
five categories of seven anagrams in each block, as noted above. The
categories were animals (horse, mouse, etc.), fruit (pears, apple,
etc.), furniture (table, stove, etc.), colors (black, white, green,
etc.), and meat (bacon, steak, etc.). A different solution was used for
each block of trials. Although none of the concept categories or
solution rules were the same as those used in the learning task, the
transfer task was essentially a continuation of the MC-BR
(multiple-concept and blocked-response) condition in the initial
learning task.

The length of all words was five letters and all words were nouns. The
categories and total lists were also matched as closely as possible on
[...] frequency counts (Mayzner & Tresselt, 1965). In order to
counteract the possibility that some of the anagrams were easier to
solve than were others, three lists were formed for each condition by
three random assignments of the selected words within each block. This
process was followed for all of the learning and transfer task lists.

The anagrams were typed in capital letters in the center of white
unlined 3 X 5 inch index cards. The cards were arranged in decks.


Procedure 


The S was seated opposite the experimenter (E) at a small table. After
brief introductory remarks E instructed S by instructions adapted from
Ronning (1965) as follows:

We are interested in finding the typical performance on several word
problems. I [...] to try to do your very best and to [...] problems as
rapidly as possible. We will be working with what is called the anagram
problem, and with five-letter anagrams in particular [...] is simply
five [...]

FORMATION OF LEARNING SETS


The S then solved five practice anagrams each with different solution
rules. Any questions were answered by E. The solutions were timed to
adapt S to the sound of the click of the stopwatch but the time was not
recorded on the practice problems.

In order to compensate further for any differences in difficulty among
different parts of a given list, each S was started at different points
on both the training and transfer lists. For this purpose, one of the
five blocks was selected by consulting a table of random digits. Each S
then solved the 70 anagrams required for the treatment to which he had
been assigned. He reported his answer (the noun) for each anagram and
then slowly spelled it, touching each letter with the eraser of his
pencil as he did so. The instructions made no reference to the solution
rules. If S found an “incorrect” word solution, he was told to “find
another word,” and the time was reset to zero. A correction procedure
was employed in those cases where S was unable to find the solution to
any anagram within 60 seconds.

For experimental and analytical purposes each list in both the training
and transfer series was viewed as consisting of five blocks of trials
with seven anagrams (trials) in each block. It should be noted, however,
that S was presented the materials for a given condition as a single
continuous series of tasks proceeding, without a break or additional
instructions, from the initial anagram in the training series to the
final anagram in the transfer series.


Results


Performance in the learning phase and in the transfer phase was measured
by time to solution, in seconds, for each anagram. The data were
analyzed by a mixed analysis of variance with two between and two within
dimensions, separately for the two phases.

An analysis of the data for the training phase yielded significant main
effects due to variations in the solution rules (F = [...], df = 2/84,
p < .05), to blocks of trials (F = 6.67, df = 4/336, p < .01), and to
anagram position within blocks of trials (F = [...], df = 6/504,
p < .01). In addition, the effect due to the interaction between anagram
position and variation in the stimulus conditions was significant
(F = [...]76, df = 6/504, p < .01). None of the other main effects or
interactions was significant (p > .05).
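
The degrees of freedom quoted with these F tests follow from the design. The sketch below is a bookkeeping check under stated assumptions, not the authors' analysis: it assumes 90 Ss in 2 X 3 = 6 between-subjects groups, with each within-subjects effect tested against its own interaction with subjects within groups, as is usual for a mixed design of this type. The function name `mixed_design_df` is hypothetical.

```python
def mixed_design_df(n_subjects, concept_levels, rule_levels, n_blocks, n_positions):
    """Degrees of freedom for selected effects in a two-between, two-within design.

    Assumption: each within-subjects effect is tested against that effect's
    interaction with subjects within groups.
    """
    n_groups = concept_levels * rule_levels
    error_between = n_subjects - n_groups            # subjects within groups
    blocks = n_blocks - 1
    positions = n_positions - 1
    return {
        "rules": (rule_levels - 1, error_between),
        "blocks": (blocks, blocks * error_between),
        "position": (positions, positions * error_between),
        "position_x_concept": ((concept_levels - 1) * positions,
                               positions * error_between),
        "blocks_x_concept": ((concept_levels - 1) * blocks,
                             blocks * error_between),
    }


# The study's design: 90 Ss, 2 concept conditions, 3 rule conditions,
# 5 blocks, 7 positions within blocks.
df = mixed_design_df(90, 2, 3, 5, 7)
```

Each entry pairs the numerator and denominator degrees of freedom, so `df["rules"]` corresponds to the 2/84 reported for solution rules and `df["position"]` to the 6/504 reported for anagram position.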

The mean times to solve the anagrams for Ss in the BR, FR, and MR
conditions were 23.13, 24.54, and 29.16 seconds, respectively. The
average times required to solve anagrams in each of the seven positions
by Ss in the MC condition and by Ss trained in the SC condition are
displayed
in Figure 1. (Note that the means in that 
display are based on data for a given 
position summed over all training- or trans- 
fer-phase trials.) Although there is an over- 
all decrease in solution time (see solid line 
in Figure 1), it can be seen that single- 
and multiple-concept training apparently 
result in different strategies for the solution 
of anagrams. As might be expected, multi- 
ple-concept training results in longer time 
to solve the anagram in the first position. 
Once solved, and the concept is identified, 
the remainder of the anagrams in that con- 
cept category are solved quickly. The in- 
crease in time to solution from Position 5 
to Position 7 by the MC group is un- 
doubtedly due to the fact that anagrams are 
blocked according to concept; that is, Ss 
eventually learn that there are limited 
numbers of instances for each concept but 
have not in the course of the training phase 
discovered that it is seven. Presumably, 
in anticipation of a change in concept Ss 
in the MC group take slightly longer time 
to solve the last two anagrams in a series 
than for the anagram in the fifth position. 

The analysis of variance for the transfer phase data yielded significant
main effects due to anagram position within blocks (F = 34.22,
df = 6/504, p < .001), to blocks of trials (F = 3.51, df = 4/336,
p < .01), and a significant interaction between blocks of trials and
kind (SC versus MC) of initial training (F = 3.45, df = 4/336, p < .01).
None of the other main effects or interactions was significant
(p > .05). The mean performance of all Ss on the anagrams in each
position within the concept series in the transfer phase is displayed in
Figure 1. The respective mean performances for the MC and SC training
groups on this aspect of the transfer phase are only decimals apart and
were not plotted. Consequently, it is interesting to note that the
slight increase in time taken on the anagram in fifth position of the
transfer series is characteristic of both groups. This effect appears to
be similar to the increase found for the MC group on the training task
although it is less pronounced in the transfer task.

[Figure 1. Vertical axis: mean time to solution (secs); horizontal axis:
position of anagram within blocks.]
Fig. 1. Mean time to solve an anagram as a function of its position
within blocks of seven trials during the training and transfer tasks.

The data for the significant interaction between kind of concept
training and blocks of trials are depicted, graphically, in Figure 2. As
can be seen in that display there is a pronounced improvement in overall
performance from the initial training trial to the final test block of
trials. The differences between the SC and MC training groups were not
significant (p > .05) on the training task but are presented here to
permit comparisons by the reader. It can be seen that the performance of
the MC training group is slightly superior on all but the fourth block
of trials to that of the SC training group during the training task.
This superiority is maintained on the first block of trials in the
transfer task, as one might expect, since Ss in the SC group had been
trained on single concepts. On the remaining blocks of trials the
performance of Ss in the SC group is equal or superior to that of Ss in
the MC group.

[Figure 2. Vertical axis: mean time to solution (secs); horizontal axis:
blocks of seven trials.]
Fig. 2. Anagram solution times for five successive blocks of trials in
the training and transfer tasks.

The differences in the performance of the SC and MC groups on the
transfer task suggest that quite different LTL processes are being
learned. While it would appear that MC training should lead to more
positive transfer than SC training this does not appear to be true for
the specific conditions of this experiment. It seems quite likely that
in the present MC training conditions Ss learn situation-specific rather
than task-related problem-solving strategies. Thus, for example, the MC
training group might have learned strategies related to the arrangement
of concepts in blocks of seven rather than the more durable
problem-solving strategies of attempting alternate moves, identifying
syllables, searching for [...] of bigrams and trigrams, and the like.


Discussion 


In this study, the effects of anagram training, with different
conditions of concept categorizations and solution rules, on transfer to
the solution of new anagrams were investigated. Manipulation of solution
rules was found to affect performance during training only and did not
differentially affect performance in the transfer task. The significant
difference obtained during training simply reflects the greater
difficulty in solving anagrams with inconsistent solution rules as
opposed to consistent solution rules whether constant throughout all
problems or whether blocked according to concept categories.

The principal effects on learning and transfer were due to manipulations
of concept categories. MC and SC training had differential effects,
during the training phase, on the solution of anagrams according to
their positions within blocks of trials. Thus, during training the MC
groups took longer to solve the first anagram within a block but solved
the remainder more rapidly than did the SC group, thereby providing
overall superiority to the MC group as measured by average time taken to
solve all anagrams. However, since this effect is to be found in the
transfer phase for all groups, without differentiation among groups, it
must be concluded that the [...] effect is due to the manner in which
the concepts were blocked. The adoption of different strategies for
solving problems is an important outcome of the two training procedures.
The MC training encourages the identification of the concept class on
the first trial which then leads to the easy identification of
associates of that class for the remaining anagrams within the class. SC
training provides less practice on the “win-stay, lose-shift” strategy,
learned in MC training, but provides more practice in identifying words
belonging to a concept class.

The performance over all blocks of trials 
by the MC and SC training conditions 
compares favorably with classical demon- 
strations of learning to learn. However, the 
two types of training differentially affected 
transition to the transfer task. Thus, if Ss 
were trained on a single category of ana- 
grams, a “set” was established that inter- 
fered with the immediate solution of a new 
class of problems. Nevertheless, after the 
first block of trials on the transfer task 
their performance was equal or superior to 
that of Ss trained on MC classes. As indicated
by Johnson (1966),


The development of a category set resembles the 
conventional concept learning experiment in that 
there are abstract similarities and superficial differ- 
ences between successive problems and that, as 
practice continues, the similarities are more readily 
perceived and a common pattern of response fol- 
lows [p. 378]. 


The present results are similar to those 
found by Adams (1954) who employed a 
much simpler training procedure than used 
in the present study. The advantage of MC 
training appears to be its effects on the ease 
of transition to a new problem. As with the 
comparable group in Adams’ (1954) study, 
the performance of the SC group equaled or 
surpassed that of the MC group once the 
transition to multiple categories was made. 

In conclusion, the trend of decreasing 
solution times over all blocks of trials sug- 
gests that increased proficiency in solving 
anagrams under the conditions of this in- 
vestigation can be attained after experience 
with a sufficient number of training prob- 
lems. Since all stimuli differ on bases other 
than concept classes, the effects are assumed 
to be due to the nonspecific transfer of 
learning how to learn. 
SOME UNPREDICTED EFFECTS OF DIFFERENT QUESTIONS
UPON LEARNING FROM CONNECTED DISCOURSE 


LAWRENCE T. FRASE*
University of Massachusetts 


It was predicted that a general orienting question would require that 
Ss process more information from a 36-word passage than would less 
general questions. Validity data on the question categories (60 college 
Ss) confirmed that the number of words in the passage which Ss 
thought were necessary to answer the questions increased from
specific to general. In spite of pretraining, Ss given the general
questions ignored words from the stimulus portion of the S-R pairs
presented in the paragraph. Neglecting critical stimuli was considered 
to be an information rejection strategy which accounted for the re- 
sults of another study, in which 84 Ss were instructed to use the 
questions as aids in learning the passage. Contrary to predictions, 
retention was lowest with general questions. Learning from
connected discourse is interpreted as a multistage process which
requires precise orienting instructions.


Anderson (1967) seems to have put his 
finger on a critical variable for controlling 
learning behaviors. He states that “...the 
most compelling stimulus in a frame is the 
question which must be answered or the 
blank which must be completed [p. 137].” A
basic problem involved in the effective con- 
trol of learning behaviors thus may not be 
whether the material is broken into small 
steps or physically separated stimulus and 
response terms, but whether the method of
instructional control (be it a program
frame, a question, or a graph) gets the
student to practice the stimuli and
responses, and to make the appropriate
associations between the two. Learning from
connected discourse, according to this
view, would be similar to paired-associate
learning in that more than one stage may
be required to achieve mastery of the
material (Underwood, Runquist, & Schulz,
1959). The point here is that there are
several alternative ways of getting subjects
to go through the behaviors involved
in these separate stages, and the
use of questions is one of these ways.

The investigation of the instructional
effects of questions is by no means a new
research area (Distad, 1927; Holmes,
[year illegible]), yet very little precise
information is available to tell us how
questions work. In
one study, Hershberger and Terry (1965)
found that a confirmation procedure was 
least effective in a programmed learning 
task. The Ss who were given the correct 
answers (confirmation) presumably did 
not read the stimulus material carefully. 
These authors concluded that question 
difficulty (availability of the correct re- 
sponse) was an important determinant of 
learning. Rothkopf (1965) also found that 
if the correct response is easy to predict 
less will be learned. The most difficult
questions evidently require Ss to process the
words to which they are exposed since 
they learn more. The basic problem, how- 
ever, is to determine what specific stimulus 
controls cause Ss to retain more when diffi- 
cult questions are asked. 

A study by Mechanic (1962) sheds some
light on this problem. Using a paired- 
associate task, he found that the nature of 
the responses required by an orienting task 
(which related to different cues in the 
stimulus lists) was of crucial importance 
for learning. If the cues used in the 
orienting task (which might be questions) 
are relevant to the experimenter’s (E’s) 
criterion, then Ss will score relatively high. 
Faust and Anderson (1967) found that 
making a program frame more difficult by 
adding irrelevant stimuli led to better re- 
tention because Ss had to at least notice 
the relevant stimuli. Another way of
stating this would be that Ss were forced to
respond discriminately to the stimuli when
irrelevant components were added. 

Questions can be used effectively be- 
cause they function to control Ss’ attention. 
Berlyne (1965) maintains that attention is 
a negative process—it consists of informa- 
tion rejection. A precise question, such as 
asking for the name of an author, date of 
birth, etc., might allow S to ignore all but 
one sentence of a reading passage. If the 
question were more general, for instance, 
asking which of two authors was born 
earlier, S would have to take more of the
information into account in order to an- 
swer the question. Schroder, Driver, and 
Streufert (1967) also emphasize the view 
that information rejection is a concomi- 
tant of attention. They point out that Ss 
“filter” inputs (by rejecting certain in- 
formation), and that more filtering will 
occur as the information load is increased. 
Information load may increase to a point 
at which Ss will abbreviate the task at 
hand, adopting what may seem to be an 
optimal strategy. Under high load, for in- 
stance in a lengthy connected discourse 
task in which reading behaviors are not 
precisely controlled, overall retention of 
the passage would decrease if Ss adopted a 
strategy which omitted some necessary 
step, such as practicing stimuli, practicing 
responses, or associating the two. 

The problem explored in the present 
study was to determine what happens to 
retention of a passage when an orienting 
question is asked which requires processing 
a relatively large or small amount of the 
total information in that passage. Study 1 
was conducted to determine, in a simple 
way, the validity of the question cate- 
gories—whether Ss agreed with E that 
more words are necessary to answer a 
general question as opposed to a specific 
question. 

The basic hypothesis of this experiment, 
explored in Study 2, was that a general 
question, with a large number of as- 
sociates within the reading passage, would 
require processing more information than 
a precise question, hence retention would 
be higher when Ss received a general
question before reading the passage. This
hypothesis is consistent with the view that
attention, under the control of certain
questions, consists of rejecting irrelevant 
information. 

To anticipate the results, the data of 
Study 2 were statistically significant, but in
the direction opposite to that predicted.
The initial validity data from Study 1
showed that there was a plausible and
interesting reason why this should have
occurred.


STUDY 1


Method 


Subjects. Sixty undergraduate educational
psychology students participated in this study as a
laboratory exercise.² The Ss were randomly
assigned to the three experimental groups.

Materials. A very simple, highly structured
paragraph of 36 words was constructed which
described two attributes about each of four
individuals. The paragraph follows.


Jim is a pilot. He was born in 1921. John
is a policeman. He was born in 1930. Jack is
a butcher. He was born in 1926. Jeff is an
engineer. He was born in 1934.


In addition, three sets of two questions were
constructed which E labeled specific (S), comparative
(C), and general (G). The specific questions were:


S1. When was Jack born?
S2. What does Jack do?


The comparative questions were:


C1. Is Jim older than Jack?
C2. Who has the more highly skilled job,
Jeff or Jack?


The general questions were: 


G1. When were the men in the paragraph
born?
G2. What jobs do the men in the paragraph
hold?


Hence, there was a specific, comparative, and
general question related to age and occupation.

Procedure and design. In order to determine
the validity of the question categories, Ss were
asked to underline all the words in the paragraph
which would comprise a complete [statement of]
the information needed to answer one of the
questions. An instruction sheet was constructed
which described this task to S and which gave
him an example and three practice problems (with
knowledge of results) including all three types
of questions. The Ss were told that there was no
time limit on this task. After reading the
instructions S turned the page and found one of the six
questions, below which was the paragraph. He then
proceeded to underline the words in the paragraph.

² The author expresses special appreciation to
Harry Schumer for providing the subjects.

The Ss from two laboratory sections were
randomly assigned to one of six groups: 10 Ss
received Question 1, 10 Ss received Question 2,
etc., for a total of 60 Ss. The hypothesis was that
Ss would underline most words with the general
question, fewer with the comparative question,
and least with the specific question. A simple
one-way analysis of variance was planned.


Results 


Table 1 presents the data which
corroborate E’s characterization of the
questions as specific, comparative, and general.
The comparative and general questions 
were, according to Ss, associated with 
more words than the specific question. The 
increase in words from specific to general 
questions was significant whether or not 
connectives were included in the word 
count. There was no variance in the num- 
ber of words underlined for specific and 
comparative questions, but there was for 
the general questions, hence the Kruskal- 
Wallis analysis of variance was used. 
There was very little difference between 
the two questions within each question 
category and the two questions (S1 and
S2; C1 and C2; G1 and G2) were
combined in this and later analyses.
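
The Kruskal-Wallis statistic with the correction for ties mentioned above can be sketched as follows. This is a minimal illustration of the standard formula, not a record of how the original analysis was run; the function names and the word counts in the usage lines are hypothetical and are not the data of Study 1.

```python
from collections import Counter


def average_ranks(values):
    """Return midranks: tied values share the mean of their rank positions."""
    positions = {}
    for index, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(index)
    mean_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [mean_rank[v] for v in values]


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H, divided by the usual tie-correction factor.

    Raises ZeroDivisionError if every pooled value is identical,
    because the correction factor is then zero.
    """
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Hypothetical word counts for three question groups (illustration only).
hypothetical_counts = [[4, 5, 4, 5], [8, 9, 9, 10], [4, 18, 20, 16]]
print(kruskal_wallis_h(hypothetical_counts))
```

With three groups, the resulting H is referred to a chi-square distribution with 2 degrees of freedom.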

The variability of Ss’ scores with the 
general question was a troublesome finding. 
Obviously, the task was extremely simple, 
but for some reason Ss in the general
question group seemed to ignore the
instructions and practice problems they had been
given. Assuming that Ss did what they
were instructed to, they should have
underlined one sentence if they had received a
specific question, two sentences if they had
a comparative question, and four sentences
if they had a general question. To explore
this idea further, the number of words
which were not underlined when they
should have been (extrusions) was
tabulated for each question category. The data
on extrusions in Table 1 suggest that the
general question group for some reason
adopted the strategy of throwing out
words, presumably a form of information
rejection.
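
The extrusion count defined above can be illustrated with a short sketch. It assumes, as one reading of the scoring rule, that the required words for the general occupation question (G2) are the four complete occupation sentences; the helper names are hypothetical, and the "underlined" set models an S who marked only the four occupation nouns.

```python
PARAGRAPH = (
    "Jim is a pilot. He was born in 1921. John is a policeman. "
    "He was born in 1930. Jack is a butcher. He was born in 1926. "
    "Jeff is an engineer. He was born in 1934."
)

# Strip sentence punctuation so each token is a bare word (36 in all).
WORDS = [token.strip(".,") for token in PARAGRAPH.split()]


def occupation_sentence_positions(words):
    """Word positions of the four occupation sentences.

    Each man contributes 9 words: a 4-word occupation sentence
    followed by a 5-word date-of-birth sentence.
    """
    positions = set()
    for start in range(0, len(words), 9):
        positions.update(range(start, start + 4))
    return positions


def count_extrusions(required, underlined):
    """Number of required word positions that were not underlined."""
    return len(required - underlined)


required = occupation_sentence_positions(WORDS)
occupations = {"pilot", "policeman", "butcher", "engineer"}
underlined = {i for i, word in enumerate(WORDS) if word in occupations}
extrusions = count_extrusions(required, underlined)
print(extrusions)
```

Positions rather than word strings are compared because words such as "is" and "a" recur across sentences.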
TABLE 1
ANALYSIS OF WORDS UNDERLINED WHICH
WOULD COMPRISE AN ANSWER TO THE
QUESTIONS

              Specific        Comparative     General
Words         Mdn    Range    Mdn    Range    Mdn    Range    Hᵃ

Underlined    4.5    1.0      8.9    2.0      18.0   16.0     30.8**
Extrusions    [illegible]     [illegible]     1.4    17.0     7.7*

Note. N = 20.
ᵃ Kruskal-Wallis analysis of variance (corrected for ties).
*p < .05.
**p < .001.


It seemed fruitful to ask whether or not 
Ss were rejecting terms in any systematic 
manner. The data revealed two things.
First, extrusions were confined entirely to 
the general question group, and second, 
within that group the extrusions were con- 
fined entirely to stimulus terms and con- 
nectives. Response terms (always the pred- 
icate of the sentence) are conceived as 
those terms which directly answer the 
questions. About 27% of Ss in the general 
question groups, instead of underlining the
four sentences which were required, simply
underlined “pilot,” “policeman,” “butcher,”
and “engineer,” or the appropriate date. If
Ss adopted this strategy when required to
learn the paragraph it seemed doubtful 
that they would make the appropriate 
stimulus-response associations. It was ex- 
pected that Ss might disregard connectives, 
but not the names of the men in the para- 
graph. 


STUDY 2


Method 


Subjects. Eighty-four Ss from three laboratory 
sections of educational psychology participated 
as a laboratory exercise. The Ss were different 
from those who participated in Study 1. 

Materials and procedure. The same questions 
and 36-word paragraph used in Study 1 were 
used in this study. The following instructions
were handed out to Ss.


[You will be] provided with a [question to]
aid you in getting relevant information from
the paragraph.
You will be allowed 20 seconds to read the
paragraph.
When the experimenter says “Ready-turn!”
turn over the page. You will see the question
and paragraph. The experimenter will then say
“Ready-turn!” again. At that time turn over
the paragraph and DO NOT LOOK BACK.


The question, which was followed by the para- 
graph directly below it, was on a separate sheet 
of paper following the instruction page. After 
Ss completed the reading task all papers were 
collected and the retention test was distributed. 

A five-alternative multiple-choice test of nine
items was administered which provided, as ques- 
tion stems, the names of the men in the para- 
graph. The Ss had to select the correct date of 
birth and occupation for each name. There was 
one additional item which required Ss to rank 
order men in terms of date of birth. In short, 
there was a test item for every sentence in the 
paragraph. 

Design. The Ss were randomly assigned to one 
of the three question treatments. There were 28 
Ss in each of the treatments (specific, comparative, 
or general question). Half of the Ss in each of
these three groups received the age-related ques- 
tion, the other half received the occupation- 
related question. 

One hypothesis was that more Ss in the specific 
question group would pass the one age or occupa- 
tion test item which was relevant to their ques- 
tion. A chi-square test with a pass-fail criterion 
was planned to test this hypothesis concerning 
specific retention. 

Another hypothesis was that the general
question group would score highest over the entire
test. A one-way analysis of variance was planned
to test this hypothesis concerning generality of
retention.


Results 


The χ² test was significant at the .01
level (χ² = 10.8, df = 2). Of the 28 Ss in
each group the percentage passing the
one age or occupation item appropriate to
the specific question group was 82%
(specific), 61% (comparative), and 39%
(general). The most precise question led to the
[best] acquisition of the specific
stimulus-response association[, confirming
the] hypothesis.

The other hypothesis to be investigated
was whether general questions would lead
to higher overall scores, that is, generality of
learning. The means for the specific,
comparative, and general groups were 4.75,
4.25, and 3.25, respectively. For these data,
F = 4.35, df = 2/81, p < .05. Duncan’s
multiple-range test indicated that only the
specific and general group means differ at
the .05 level. In short, the general question
group scored lowest whether the criterion
of performance was a specific
question-relevant test item or the total retention
test.
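
The reported chi-square value can be checked with a short sketch. The pass counts of 23, 17, and 11 out of 28 are an assumption: they are the whole numbers that round to the reported 82%, 61%, and 39%, and are not stated in the text. The function name is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Assumed pass counts per group of 28 (specific, comparative, general).
passed = [23, 17, 11]
table = [[p, 28 - p] for p in passed]  # rows: group; columns: pass, fail

chi2, df = chi_square_independence(table)
print(round(chi2, 1), df)
```

Under these assumed counts the expected frequencies are 17 passes and 11 failures per group, so the statistic is 72/17 + 72/11, which agrees with the reported 10.8 on 2 degrees of freedom.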


Discussion 


The results of both studies taken together
suggest that questions can have subtle
effects upon Ss’ performance which may not
be anticipated by the unwary instructor.
Questions may work, in the sense that
they cause Ss to pay close attention to the
passage, but the phrasing of the questions
might select out only a portion of the
necessary stimuli. In the present study the
questions were designed to get Ss to respond
to a relatively small or large amount of the
material, but the general questions did not
work that way. Instead, several Ss
concentrated upon only the response terms. This
finding may have implications for the
inadequacy of Ss’ underlining or note-taking
skills. On the assumption that learning
from connected discourse involves both a
response learning and associative phase
(Underwood, Runquist, & Schulz, 1959),
it is clear that questions which are to be
used as instructional aids must be phrased
in such a way that Ss are directed to
rehearse the stimuli, the responses, and also
the associations between the two. In
confirmation of the Faust and Anderson
(1967) finding, an efficient frame (or
question) must insure that Ss practice more
than just the response terms. To reword
Anderson’s (1967) maxim, the most
compelling stimulus in a paragraph is the
word or set of words which directly
answers the question or fills in the blank.
The precise responses required by the
orienting task (questions), in confirmation
of Mechanic’s (1962) findings, seem to have
been critical for learning in this study.
The Ss’ responses were more precise than
E’s skill at making up prequestions.

The data on extrusions confirm the
process of selective information rejection
(attention) which Berlyne (1965) and
Schroder, Driver, and Streufert (1967)
have suggested. The general questions
used in the present studies evidently
represent a case of maximal information load,
or uncertainty. All Ss received the same
paragraph, hence the amount of nominal
uncertainty or information in the para- 
graphs was held constant, but the ques- 
tions functioned to systematically induce 
more or less effective uncertainty into the 
learning task. With higher uncertainty 
(the comparative and general questions) 
the subtle inadequacy of the questions (in 
terms of the criterion retention test) be- 
came more critical, and Ss adopted their 
own strategies of learning. The general 
conclusion seems to be that, as effective 
uncertainty or information load increases, 
precise control over reading behavior be- 
comes more imperative. In a program 
frame information load is limited and hence 
precision of control is obtained by the 
format of presentation. 

At the other extreme, learning from 
continuous discourse material (a high un- 
certainty condition) shifts the burden of 
control from the format of presentation to 
the orienting task. The orienting task, 
whether a question, a graph, or combina- 
tions of various aids, must insure that 
Ss execute all the responses necessary for 
successful performance of the criterion 
task. This includes rehearsing the stimu- 
lus, rehearsing the response, and putting 
the two together. Ultimately such precise 
control reduces to a programmed learning 
task, except that it retains the advantage 
of keeping the learning material together 
in one place. Presumably, there are
advantages in contiguous presentation of a
topic (Ausubel, 1963). As this simple
experiment has shown, problems of effective
stimulus control become critical under free
response learning conditions.


The present research was designed to contrast the psychological
processes underlying perception of pictorial and verbal stimuli.
72 students learned a 12-pair set of either English word-Japanese
word or simple picture-Japanese word equivalents which were either
conceptually dissimilar, similar-isolated, or similar-grouped. The
findings were that pictures and words representing the same familiar
objects did not function as equivalent stimuli in learning the set of
language equivalents. Generally, pictures facilitated the learning of
equivalent pairs, especially when the objects represented by the
abstractions were conceptually similar (p < .005). Further, grouping the
abstractions representing conceptually similar objects increased the
rate of acquisition.


Whenever the conditions for learning a 
semantic relation are arranged, a decision 
is made as to how the referent shall be
presented. Although the choice may be
made between concrete and abstract forms
(i.e., between the events and their
abstractions), within educational settings the choice
more frequently is made between types of 
abstraction (e.g., between words and pic- 
tures). 

In foreign language learning it has long 
been considered expedient, if not optimum, 
to learn the meaning of words through 
explicitly relating new words to old. Thus, 
vocabulary learning in a foreign language 
has often occurred by pairing a word or 
phrase from the to-be-learned language 
with its equivalent within the natural 
language under conditions where the learner 
understands that the foreign word “means 
the same” as its natural language equiv- 
alent. 

Current trends in the teaching of foreign
languages do not favor the word-word
equivalents approach to learning meaning,
however. Advocates of the “audio-lingual,”
“audio-visual,” or “conversational”
approach to foreign language learning (Rivers,
1964) contend that the learning of
equivalents “binds” the learner to his natural
language so that utterances within the new
language must be preceded by the
equivalent within the natural language. Such a
two-step process, it is claimed, necessarily
impedes the development of facility within
the new language. Recent theoretical
formulations (Deese, 1964, for example) and
empirical evidence (Jenkins, Neale, &
Deno, 1967; Karwoski, Gramlich, &
Arnott, 1944) suggest, however, that
nonlinguistic events are processed
linguistically. If this is true, it may be virtually
impossible to avoid the “two-step” process
with a native speaker.

¹ The data on which this paper is based were
included in the author’s dissertation [submitted in]
partial fulfillment of the requirements for the
[degree illegible] at the University of Minnesota.
[Appreciation] is expressed to Dani[...], who
directed this study.

The present research was designed to
contrast the psychological processes
underlying perception of pictorial and verbal
stimuli. Underwood and his associates
(Underwood & Schultz, 1961; Wallace &
Underwood, 1964) have clearly
demonstrated that the rate of learning a set of
word pairs is decreased when the members
of the set are conceptually similar. The
strategy employed in the present study
was to use sets of words and pictures as
stimuli which represented the same
common objects under conditions of both high
and low conceptual similarity. The
prediction was that an increased difficulty in
learning with a high similarity set of words
would be paralleled by an increased
difficulty in learning with a high similarity
set of pictures. This hypothesis was based
on the assumption that during learning
the pictures would be linguistically
encoded and processed in the same manner
as the words.


METHOD


Subjects 

Thirty-six males and 36 females enrolled in 
introductory educational psychology classes at 
the University of Minnesota during the summer 
of 1965 served as subjects (Ss). The majority of 
the students were juniors enrolled in the College 
of Education. Each student participated in the 
experiment on a voluntary basis, but received
course credit for his participation as an S.

Two restrictions were placed on the selection 
of Ss. First, no student could serve as an ex- 
perimental S if his native tongue was not the 
English language. Second, no student could serve 
as an S who had previously studied an Oriental
language. The first restriction was applied to en- 
sure that all Ss would possess approximately the 
same natural language habits in relation to the 
stimuli employed, and the second restriction was 
applied to avoid the possibility that an S might 
have been familiar with the Japanese words which 
were learned as responses. 


Materials 


The Ss learned a set of equivalents consisting
of 12 pairs. The stimuli were either pictures or
words representing 12 common objects and the
responses were 12 Japanese words.

Twenty-four different objects were represented
either by verbal (one word) labels or by simple
black and white line drawings. Twelve of these
representations comprised a conceptually
dissimilar list, and 12 comprised a conceptually
similar list. The conceptually similar lists
contained three instances from each of four
conceptual categories: animal, clothing, furniture, and
[building]. The nature of the lists employed can best
be understood by examining Table 1.

As shown in the table, the conceptually similar
list was arranged in two different ways. Either
the instances from a particular category were
presented in such a way as to be maximally
[separated] from other instances within the same
category during the presentation (List 2), or the
category instances always appeared in sequence
during the presentation (List 3).

The Japanese words used as responses were
[taken] from the list of responses used by Horowitz
and Larsen (1963). Japanese words were used as
responses because their English transliterations
are [easily] read and pronounced by persons native
to the English language. Despite this relative ease
of pronunciation, however, Japanese words are
not derived from a language related to English,
and, consequently, appear very unfamiliar to
someone who has not had previous experience
with them.
TABLE 1
REFERENT STIMULI AND JAPANESE WORD
RESPONSES

Stimuli (either word or picture)
Dissimilar   Similar-isolated   Similar-grouped   Responses

WOMAN        HAT                HOUSE             Atsui
DOOR         CHAIR              SCHOOL            Hune
FISH         CAT                CHURCH            Riko
KNIFE        TABLE              BED               Kari
NOSE         CHURCH             CHAIR             Baka
STOOL        DOG                TABLE             Amai
BOAT         SCHOOL             COAT              Hikui
MOON         TIE                HAT               Hayai
STORE        MOUSE              TIE               Kuro
APPLE        COAT               DOG               Tako
RADIO        HOUSE              MOUSE             Chikai
LADDER       BED                CAT               Tooi

The same 12 Japanese words were used as re- 
sponses regardless of the objects represented by 
the stimuli. Using the same response words allowed 
comparisons among stimulus lists which were not 
confounded by different rates of response acquisi- 
tion. The list of responses also appears in Table 1. 

Two different random pairings of stimuli and 
responses were used to reduce the possibility that 
a particular set of stimulus and response pairings 
was easier to learn. The only constraint placed 
upon the randomizations was that a response was 
not paired with the same stimulus in both
randomizations. The responses were, of course, paired
with different stimuli in the cases of dissimilar and 
similar lists, but the responses were paired with 
the same stimuli for the two different arrange- 
ments of similar stimuli. 

Three different random orders of the lists were 
made to reduce the possibility of sequence effects 
in response learning. The randomizations of Lists 
1 and 2 were freely accomplished, but the related 
stimuli in List 3 had to be grouped so that ran- 
domizations for this list were composed by first 
randomly assigning the category positions within 
the list, and then randomly assigning the instances 
within each category. 
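
The two-step randomization described for List 3 can be sketched as follows. This is a minimal illustration, not the procedure actually used to prepare the slides; the function name, the seeds, and the category label "building" are assumptions introduced here, while the stimulus words are those of the similar-grouped list in Table 1.

```python
import random

# Similar-grouped stimuli; "building" is an assumed label for the fourth category.
CATEGORIES = {
    "building": ["HOUSE", "SCHOOL", "CHURCH"],
    "furniture": ["BED", "CHAIR", "TABLE"],
    "clothing": ["COAT", "HAT", "TIE"],
    "animal": ["DOG", "MOUSE", "CAT"],
}


def grouped_order(categories, rng):
    """Randomize category positions first, then instances within each category."""
    names = list(categories)
    rng.shuffle(names)  # step 1: random order of the categories
    order = []
    for name in names:
        instances = list(categories[name])
        rng.shuffle(instances)  # step 2: random order inside the category
        order.extend(instances)  # instances of a category stay adjacent
    return order


# Three presentation orders, one per arbitrary seed.
orders = [grouped_order(CATEGORIES, random.Random(seed)) for seed in (1, 2, 3)]
for order in orders:
    print(order)
```

Shuffling in two stages keeps the three instances of a category adjacent in every order, which a single shuffle of all 12 stimuli would not guarantee.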


Apparatus 


[...]graphed on 35-mm. [...] negatives were
[...] produced were white on a black background. The
white on black contrast was selected to reduce
glare produced by rear projection.
The images were projected at approximately
[...] on a 15-inch × 30-inch rear projection
screen by two carousel-type slide projectors.
The slide projectors were connected to a timer
which advanced the carousel and presented a new
[...] controlled by a shutter in front of each projector
lens. The stimulus slides were placed in one
projector and the response slides in another. As
soon as the slides had advanced, a shutter in front
of the stimulus slide would raise to present the
stimulus (word or picture). After 2 seconds the
response slide (Japanese word) would be exposed.
Then, both shutters would drop and the timer
would advance the carousel so that a new stimulus
and response pair were available for presentation.
During the advance the screens were black for
slightly less than 1 second.


Procedure 


The Ss participated individually. Each S was 
randomly assigned to 1 of 6 treatments based on 
the order in which he reported to the laboratory. 
The only restraint on this random assignment 
was that an equal number of male and female Ss 
should participate in each of the treatments. The 
treatment groups comprised a 2 X 3 factorial 
with two types of stimuli and three arrangements 
of list similarity as the conditions. 

Each S was seated at a table before the rear 
projection screen and given instructions that he was 
going to learn the meanings of some Japanese 
words. It was explained to him that he was to learn 
these meanings under a paired-associate anticipa- 
tion method. First the event signified by the word 
would be projected on the screen and then a few 
moments later the Japanese word which was equiv- 
alent to that event would be flashed alongside of it. 
His task was to learn the meaning of each word so 
that on succeeding presentations of the abstraction 
he would be able to anticipate the Japanese word
before it flashed upon the screen. The Ss were 
encouraged to guess if they were not sure. 

The three random orders of each list were then
presented continuously to Ss, although there was
a brief delay following Order 3 so that the carousel
could be advanced to the beginning of Order 1
again. The Ss continued through the list until
they correctly anticipated the Japanese word
responses on two consecutive trials, or until 26
trials had occurred.


Recording and Treatment of Data


All responses were recorded by the
experimenter on a data collection form. Comparisons
between groups were made in terms of number of
trials to criterion (twice through the list without
error) and error rates. Since different stimulus
lists might require more trials for learning, and,
therefore, increase the number of opportunities to
make errors, total errors was not used as a basis
for comparison. Instead, the number of errors
in 10 trials was used to compare error rates. The
first 10 trials were selected because beyond that
point many Ss attained criterion.

Errors were classified as omissions (failures
to respond) and intrusions (overt errors), and
separate comparisons were made for both types as
well as omissions and intrusions combined.


RESULTS


The results of learning the equivalent
word-word and picture-word pairs are
summarized in Table 2. The error means
in the table are based upon responding
during the first 10 trials. Six Ss in the
similar-isolated words group failed to
achieve the criterion within 26 trials and
were given a trials-to-criterion score of [...]
(the fewest number of trials in which the
Ss could have achieved the criterion).

The results are reported in terms of a set
of orthogonal comparisons which were of
prior interest.


TABLE 2
MEANS, STANDARD DEVIATIONS, AND PROBABILITY VALUES FOR DIFFERENCES
ON TRIALS TO CRITERIONᵃ AND ERROR MEASURESᵇ

Stimulus            Trials to      Omission errors      Intrusion errors     Combined errors
condition           criterion      M     SD    p        M     SD    p        M     SD    p

Dissimilar
  Words             [illegible]    43.7  20.8  -        14.5  10.0  -        58.2  18.1  [illegible]
  Pictures          [illegible]    38.3  17.7            9.8   9.5           48.2  20.2
Similar-isolated
  Words             [illegible]    71.0  13.0  <.005    17.3   9.8  -        88.3  18.5  [illegible]
  Pictures          [illegible]    47.8  20.5           19.7  17.4           67.5  18.9
Similar-grouped
  Words             [illegible]
  Pictures          [illegible]

ᵃ Twice through the list without errors.
ᵇ Errors based on responding in the first 10 trials.
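
The two dependent measures defined under Recording and Treatment of Data can be sketched as follows. This is an illustration under stated assumptions, not the scoring actually applied: it assumes that the trials-to-criterion score counts the two errorless criterion trials themselves, and it returns None for an S who does not reach criterion within 26 trials rather than assigning a substitute score. The function names are hypothetical.

```python
MAX_TRIALS = 26  # learning stopped after 26 trials if criterion was not met


def trials_to_criterion(errors_per_trial, max_trials=MAX_TRIALS):
    """Trial number of the second of two consecutive errorless trials.

    errors_per_trial[0] is the error count on trial 1. Returns None if
    the criterion is not reached within max_trials.
    """
    last_index = min(len(errors_per_trial), max_trials)
    for index in range(1, last_index):
        if errors_per_trial[index - 1] == 0 and errors_per_trial[index] == 0:
            return index + 1  # convert 0-based index to trial number
    return None


def errors_in_first_trials(errors_per_trial, n_trials=10):
    """Total errors over the first n_trials trials (the error-rate measure)."""
    return sum(errors_per_trial[:n_trials])


# Hypothetical record for one S: errors on each successive trial.
record = [3, 1, 0, 2, 0, 0]
print(trials_to_criterion(record), errors_in_first_trials(record))
```

Restricting the error count to a fixed window keeps a slowly learned list from accumulating errors merely because it offered more trials.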
Dissimilar Stimuli

The results of learning the equivalent 
pairs where the stimuli were conceptually 
dissimilar show that Ss learning with
pictures performed better than Ss learning 
with words on every dependent variable. 
In no case, however, are the obtained dif- 
ferences between group means reliable. 


Similar-Isolated Stimuli


Isolating similar stimuli in the sequence of presentation
produced very large and significant advantages for
learning with pictures rather than words as stimuli on
all dependent variables except intrusion errors
(F < 1.00, df = 1/66). Whereas the mean number of trials
to criterion increased from 14.2 with dissimilar pictures
to only 15.9 with similar-isolated pictures, the mean
number of trials with words as stimuli increased from
15.2 with dissimilar words to 23.8 with similar-isolated
words. The large difference in error rate between
pictures and words which were similar and isolated was
due to the difference in errors of omission (F = 11.01,
df = 1/66, p < .005). Within the first 10 trials Ss in
the similar-isolated pictures condition actually made
more intrusion errors than Ss in the similar-isolated
words condition (words X = 17.3; pictures X = 19.6),
although the difference was very small and not
statistically significant (F < 1.00, df = 1/66). In sum,
Ss learning with words or pictures were about equally
likely to make an overt response which was incorrect,
but Ss learning with pictures were more likely to make
overt responses which were correct than Ss learning with
words. The overall effect was that those Ss learning
with similar-isolated pictures emitted Japanese word
responses at a much higher rate during the first 10
trials than those Ss learning with similar-isolated
words. This occurred in spite of the fact that all Ss
had an equal opportunity to acquire and emit the
responses during learning.


Similar-Grouped Stimuli 


Grouping similar stimuli during the sequence of
presentation produced differences between word and
picture means which were much smaller than with
similar-isolated stimuli. As measured by trials to
criterion, the picture-word difference for
similar-grouped stimuli, although statistically
significant (F = 5.88, df = 1/66, p < .025), was
approximately one-half the
difference obtained between pictures and 
words when similar stimuli were isolated 
during presentation. The difference between 
words and pictures which were similar but 
grouped during presentation was statisti- 
cally significant only for number of trials to 
criterion. None of the error rate measures 
yielded a mean difference which approached 
statistical significance (F < 1.00, df = 1/66 
in all cases). 


Discussion 


The results indicate that it makes little 
difference whether words or pictures are 
used to represent objects when learning a 
set of equivalent pairs. The conclusion 
holds, however, only if the objects repre- 
sented are conceptually dissimilar. When 
the events represented are conceptually 
related, pictures seem to be much more 
easily associated with foreign word re- 
sponses than words. The extent of advan- 
tage of pictures over words is different, 
however, depending upon the dependent 
variable considered, and whether or not 
conceptually related stimuli are presented 
in isolation or in groups during learning. 
In terms of trials to criterion, learning 
with conceptually related words is signifi- 
cantly more difficult than learning with 
conceptually related pictures regardless of 
stimulus arrangement. With respect to 
error measures, however, statistical sig- 
nificance depends upon the sequence in 
which conceptually related stimuli are pre- 
sented to the learner (i.e., whether isolated 
or grouped).

The findings with respect to grouping 
conceptually similar stimuli are consistent 
with the results of a study by Gagné (1950) 
in which grouping similar nonsense form 
stimuli facilitated learning. In the present 
study, grouping conceptually related stim- 
uli reduced learning difficulty to a point 
where a set of pairs could be acquired almost
as well with similar stimuli as with
dissimilar stimuli.

The obtained interaction between stim- 
ulus mode and stimulus similarity is par- 
ticularly interesting. If it is assumed that 
the increased difficulty in learning with con- 
ceptually related words occurs because of 
similarity in meaning (conceptual simi- 
larity), then it must be concluded that 
the pictures, although readily identified 
with the appropriate word labels, are not 
encoded in the same manner as the words.
This conclusion, derived from the differential
effects obtained during the learning of
language equivalences (i.e., in a paired-
associate learning task), is consistent with
evidence obtained by Deno, Johnson, and 
Jenkins (in press) in a study comparing 
the distributions of free associations to 
words and pictures. These investigators, 
using the same words and pictures as in 
the present study, found that mode of rep- 
resentation (word or picture) significantly 
altered the associative similarity between 
objects represented. 

Both the differential efficiency in learning with words
and pictures and the apparent difference in
psychological effect are seen as relevant for anyone
attempting to use these two different kinds of
abstractions to communicate meaning. These results are
particularly significant in foreign language learning.
While learning the referent for a foreign word may be
more efficient if the event is portrayed pictorially,
there is the danger that the meaning evoked by the
picture may not be the same as that intended. It may be
that communication by word is more reliable.

PART-TASK VERSUS WHOLE-TASK PROCEDURES FOR
TEACHING A PROBLEM-SOLVING
SKILL TO FIRST GRADERS¹


RICHARD C. ANDERSON 
University of Illinois 


Groups of 1st graders were trained to solve
concept-attainment problems by either a small-step,
programmed part-task method or a whole-task method in
which S attempted terminal problems early in training.
Contrary to previous research, which had shown that
whole-task methods are superior for highly organized
tasks, in the present instance the part-task group
performed better than the whole-task group on terminal
training problems and on similar problems presented
again later to measure retention; however, there was no
difference between these groups on transfer problems.
Both training groups were superior to a no-treatment
control group on all measures.


Educators who have been influenced by 
the programed-instruction movement take 
it as self-evident that the best way to 
teach a complex skill is to analyze it into 
component subskills and subconcepts, then 
teach each of these in turn. Cast in dif- 
ferent language such an approach is a 
part-task method, to be contrasted with the 
whole-task method in which the student is 
required to perform the terminal behavior 
as best he can from the very beginning of 
training. Surprising as it may seem to 
those who have been influenced by the 
conceptions of programmed instruction, the 
research on complex skill training has frequently
shown whole methods to be superior
to part methods.

The terms “part” and “whole” will be 
used in this paper as shorthand words for 
talking about the issue of how lengthy and 
complicated a segment of a task the stu- 
dent should be required to attempt during 
instruction, especially during the initial period of
instruction. Part methods result in low initial error
rates and fast progress, at least at the beginning of
instruction, but when account is taken of the time to
combine the parts, typically the advantage for the part
method has been negligible at best. In the case of rote
materials, practice on later parts (sublists) produces
interference with earlier parts. This interference must
be overcome during the combination stage. With respect
to complex skills and structured, meaningful material,
there are coordinations and interrelationships among the
subskills and subconcepts that cannot be acquired from
training with the components alone. Herein lies one
apparent reason that whole-task training has frequently
proved superior to part training in the case of complex
skills.

¹ The research described herein was completed [...] the
Office of Education, United States Department of Health,
Education, and Welfare. The author is indebted to Donald
Cunningham, Diane Fuglsang, Marianne [...] for their
assistance in this project. The author is grateful for
the cooperation of [...] Tolono [...] School [...].

Whether a procedure which emphasizes 
lengthy task segments will prove superior 
to a procedure that begins with short task 
segments will surely be heavily dependent 
upon the manner in which the training procedures are
developed. If entering behavior is underestimated and
the steps in the part procedure are more finely
granulated and more numerous than necessary, it may be
less efficient than a procedure in which [...]. Further,
particularly when the task analysis underlying the part
procedure is incomplete, subjects (Ss) who receive [...]
or small-step procedure may fail to learn coordinations
among component skills and concepts, whereas Ss trained
from the beginning on larger segments of the task may
induce these coordinations.

There may also be characteristics of 
tasks which systematically interact with 
the length and complexity of the instructional
units into which the task is di-
vided. Naylor and Briggs (1963) have sug- 
gested that the relative effectiveness of 
part and whole methods is a function of 
the level of complexity and the degree of 
organization of the task. Task complexity 
is said to refer to “demands on informa- 
tion-processing and memory-storage ca- 
pacities” while task organization is said to 
refer to the nature and extent of the inter- 
relationships among task dimensions. As 
organization increases, whole methods are 
predicted to be increasingly superior to 
part methods. For a highly organized task, 
an increase in complexity (difficulty) is 
predicted to result in greater superiority 
for the whole method. The part method is
predicted to be superior to the whole 
method only in the case in which the task 
is both complex and unorganized. Naylor 
and Briggs (1963) completed an experi- 
ment, using what may be called concept- 
learning tasks, in which it was found that 
the whole method was much better than 
the progressive-part method on a highly or- 
ganized task regardless of task complexity. 
On an unorganized task, the whole method 
was slightly better when task complexity 
was low and very slightly worse than the 
progressive-part method when task com- 
plexity was high. 

The experiment reported herein in- 
volved a comparison of a small-step, pro- 
gramed part-task method and a whole- 
task method for teaching children a 
complex problem solving skill, the skill of 
varying each factor in succession while 
holding all other factors constant. This
skill is a classical strategy of experimental
science and it is applicable to a
large and important class of problems
(Bruner, Goodnow, & Austin, 1956, pp. 81 [...]).

A concrete example will serve to illustrate the kind of
problem presented to the child and the sort of behavior
expected from him. The materials consist of eight cards
upon which are pictured either one or two rectangles or
diamonds which are either red or green. The following
instructions are read to the child:

I will think of a secret and point to one card that
shows my secret. You pick cards to figure out my secret.
Each time you pick a card I will tell you whether it
shows the secret. As soon as you know, tell me the
secret.


There are 27 different conjunctive concepts which can
be formed with the card array (and the other materials
described later). There is 1 zero-dimensional concept
(all the cards), and there are 6 one-dimensional
concepts (for example, red or two), 12 two-dimensional
concepts (for example, green diamond), and 8
three-dimensional concepts (for example, one red
rectangle). The problem might involve any
one of the 27 possible concepts. The child 
does not know which stimuli are discrimi- 
native nor how many dimensions are rele- 
vant. He begins with only a positive or 
focus instance. It is not enough that the 
child says the right answer. To be counted 
as having solved the problem, the child 
must choose a set of cards and state a con- 
cept such that the set of cards logically im- 
plies the concept that he states and no 
other. 
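
The count of 27 concepts can be checked by enumeration. The following is a minimal sketch, not part of the original study: it assumes that a conjunctive concept either ignores a dimension or fixes it at one of its two values, which gives 3 × 3 × 3 = 27 possibilities. The names `DIMENSIONS`, `conjunctive_concepts`, and `count_by_dimensionality` are illustrative only.

```python
from itertools import product

# The three binary stimulus dimensions of the card array, as described in the text.
DIMENSIONS = {
    "number": ("one", "two"),
    "color": ("red", "green"),
    "form": ("rectangle", "diamond"),
}


def conjunctive_concepts(dimensions):
    """Return every conjunctive concept as a dict {dimension: required value}.

    A dimension left out of the dict is irrelevant to the concept, so the
    empty dict is the zero-dimensional concept ("all the cards").
    """
    # For each dimension: None (irrelevant) or one of its two values.
    options = [[None, *values] for values in dimensions.values()]
    concepts = []
    for choice in product(*options):
        concept = {d: v for d, v in zip(dimensions, choice) if v is not None}
        concepts.append(concept)
    return concepts


def count_by_dimensionality(concepts):
    """Count concepts by the number of relevant dimensions."""
    counts = {}
    for concept in concepts:
        counts[len(concept)] = counts.get(len(concept), 0) + 1
    return counts


CONCEPTS = conjunctive_concepts(DIMENSIONS)
COUNTS = count_by_dimensionality(CONCEPTS)
print(len(CONCEPTS), COUNTS)
```

Under this reading the enumeration yields 1 zero-dimensional, 6 one-dimensional, 12 two-dimensional, and 8 three-dimensional concepts, in agreement with the figures given above.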

Relative to the tasks employed by Nay- 
lor and Briggs (1963) the problem-solving 
task used in the present experiment would 
have to be regarded as highly organized. 
Furthermore, the task is very difficult 
(complex) for seven-year-olds (Anderson, 
1965), the age of the Ss in the present 
study. According to Inhelder and Piaget 
(1958, p. 335) people do not normally ac- 
quire the skill taught in the present ex- 
periment until 14-15 years of age. For 
these reasons, if Naylor and Briggs (1963) 
are correct, this should be a case in which 
the whole method is vastly better than the 
part method. 


Method


Materials 


Three sets of materials were employed. The card array
consisted of eight 2½ × 3[¾?] inch cards taped in an
orderly arrangement on a 16 × 18 inch Masonite panel.
The cards had figures inscribed upon them that varied
with respect to number (one figure or two figures),
color (red or green), and form (rectangle or diamond).
The s[...] game consisted of eight toy plastic cowboys
with accessories imbedded in an orderly arrangement in a
9 × 13 inch plaster-of-paris base. The cowboys were
either standing or riding horses, either with or without
hats, and either with or without rifles. The final set
of materials consisted of eight [...] hexagonal No. 2
pencils. The pencils displayed three attributes: length
(6 or 2 inches), presence or absence of eraser, and
sharpened or unsharpened. The pencils were arrayed in a
dis[...] fashion on a table in front of S. While the
three sets of materials involved different stimulus
dimensions and different “story lines” were employed
with them, an identical problem-solving task could be
created with each set of materials.

Three programs, one for each of the three sets of
materials, were developed to teach children to apply the
technique of varying each factor in succession while
holding all others constant. These programs had their
origins in a free wheeling, nonprogrammed training
procedure which, nevertheless, achieved considerable
success with first graders (Anderson, 1965). A
modification of this procedure, which more nearly
resembled a program, was first developed for the card
array. After a cycle of tryout and revision the program
was used with 10 second graders from a school in a rural
community. On the last 10 frames of the program, upon
which terminal behavior was required, seven children
made no errors, one made one error, one made two errors,
and one made [...]. The mean time for completion was 73
minutes.

The pencil program consisted of a literal translation of
the card array program. In each frame, a pencil word or
symbol was substituted for every card array word or
symbol. No other modifications were made. Prior to the
experiment, the pencil program was run with 10 naive
second graders from a school in a rural community. The
mean time to complete the program was 104 minutes. On
the last 10 frames, six children made no errors, three
made one error, and one made [...] errors. The cowboy
program was also created as a literal translation of the
card array program; [...] the cowboy program was not
used with children prior to the experiment.

The final form of the programs embodied an analysis of
the total problem-solving strategy into [...] major
subskills. The first section of each program was
designed to teach appropriate conclusion-drawing
behavior. The experimenter (E) began by naming concepts
while S pointed to all of the positive instances of the
concepts. Then the procedure was reversed; E pointed to
all of the positive instances of concepts and S named
the concepts. Next, E pointed to sets of instances, some
positive and some negative, in such a way that they
defined a concept; S named the concepts. For example, E
might point to a long sharpened pencil with no eraser
and a short sharpened pencil with no eraser, indicating
that each of these “showed the secret,” then a long
sharpened pencil with an eraser and a long unsharpened
pencil without an eraser, indicating that the latter two
did not “show the secret.” If the child responded “the
secret is sharpened with no eraser,” he answered
correctly. When S could correctly name seven out of
eight consecutive concepts given a set of defining
instances (a criterion he was required to meet before
proceeding), he was judged to have acquired a
satisfactory approximation of the conclusion-drawing
skill.

The second component in the total problem-solving skill
is the skill of selecting appropriate instances. To
begin the section of the program teaching this skill, E
pointed to an instance. The S was required to pick an
instance which was different from E’s instance in a
specified way but the same in every other way. For
example, S might be instructed to “pick a pencil just
the same as mine except that it is a different length.”
After several frames involving the stimulus dimensions
of a task taken one at a time, S was then required to
pick three instances, each of which differed in exactly
one respect from the instance designated by E. When S
reached a criterion of seven out of eight consecutive
correct selections of sets of three instances, he was
judged to have mastered the skill of selecting
instances.

The final section of each version of the part-task
program taught the child to integrate the
conclusion-drawing skill with the instance-selection
skill. In this section, the child selected a set of
instances; [...] child’s instance, E defined a concept;
finally, the child named the concept. [...] corrected on
the 20 terminal frames, each of which entailed a problem
presented using the same procedures as were used for
test problems.

A standard correction procedure, not expressly described
within the program, was used by E whenever S made an
error. The E presented a new problem similar to the one
upon which the error was made, told S the answer to the
new problem, and then presented the original problem a
second time. This procedure almost always prompted the
correct response.


[...] of words and symbols would permit the literal
translation of one [...]. The first section of the
whole-task program was identical to the first section of
the part-task program. In this section E named concepts
and S pointed to all of the positive instances of the
concepts. Thereafter, S received terminal problems. The
E designated a focus instance. The S's task was to
choose instances until he could name the concept.
Whenever S chose an instance, E indicated whether the
instance “showed the secret.” With one exception, the
procedures for presenting problems within the
whole-training program were the same as the procedures
for administering test problems to be described in the
next section. The exception was that when S selected six
instances during a training problem without solving the
problem, E told the child the correct concept. Feedback
of this sort was not given during the last 20 training
problems nor during test problems.

The part-task and whole-task programs were equated in
terms of the total number of task-relevant overt
responses required under the assumption of error-free
performance. This measure resembles measures such as
number of trials that can be applied to simple tasks.
For example, a child who behaves ideally on a terminal
problem will select three instances and state a
conclusion, a total of four distinguishable overt
responses. Each of the versions of the part-task program
required a total of 360 task-relevant, overt responses
whereas each of the versions of the whole-task program
required 364 such responses. The first 28 and the last
80 responses (20 terminal problems) were the same for
both programs. In between, those who received the
part-task program were led to make a progression of 252
responses designed to teach them a conclusion-drawing
skill, an instance-selection skill, and to integrate the
two, as detailed earlier. The middle section of the
whole-task program, on the other hand, contained 64
terminal problems which could have been solved with 256
overt responses. It should be emphasized that these
calculations are based on the assumption of error-free
performance. Of course, errors were made. Based on data
collected during the experiment, the typical S who
received the part-task program made an estimated 410
overt, task-relevant responses while the typical S who
received whole-task training made about 620 such
responses.
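
The error-free response totals stated above follow from simple arithmetic, which the sketch below reproduces. All figures are taken from the text; the variable names are illustrative and not part of the original report.

```python
# Ideal performance on one terminal problem: three instance selections
# plus one stated conclusion.
RESPONSES_PER_PROBLEM = 3 + 1

shared_opening = 28                      # first section, identical in both programs
closing_problems = 20                    # terminal problems ending both programs
shared_closing = closing_problems * RESPONSES_PER_PROBLEM   # 80 responses

part_middle = 252                        # part-task progression of responses
whole_middle_problems = 64               # terminal problems in whole-task middle section
whole_middle = whole_middle_problems * RESPONSES_PER_PROBLEM  # 256 responses

part_total = shared_opening + part_middle + shared_closing    # 360
whole_total = shared_opening + whole_middle + shared_closing  # 364

print(part_total, whole_total)
```

The two programs therefore differ by only four responses under error-free performance, whereas the estimated actual totals (410 versus 620) differ considerably.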


Procedure 


The training and the test problems were presented by
three female graduate assistants, all of whom had had 15
or more hours experience training and testing children
prior to the experiment. The author monitored 1-2 hours
of each E’s preexperimental training and testing
performance. Several staff conferences were held in
which the letter and spirit of the procedures were
detailed, ambiguities resolved, and difficult problems
discussed. In addition, each E had a 10-page manual
giving an overview of the experiment and [...] and a
7-page [...] setting forth the procedures for
administering test problems.


Each child was trained and tested by a single E.
One-third of the Ss under each treatment in the
experiment were run by each E. Training and testing
sessions were scheduled to be 20 minutes in length.
Unless the child was sick or some other circumstances
such as a special school program intervened, the child
received three sessions a week until he completed the
training and the testing. Most sessions were conducted
at three widely separated stations in a large
general-purpose room in the cooperating elementary
school.

For both part-task training and whole-task 
training there was a mimeographed copy of the 
program for each child. The child did not read the 
program. Rather, the program was a script that 
guided the behavior of E. Except as otherwise
indicated, E adhered closely to the program, which
described the stimulus S was to see, contained the
verbatim language E was to use, and indicated
the response or responses S was to give.

Under both training methods E made generous use of
social reinforcement. The frequency and contingencies of
reinforcement were not expressly indicated within the
programs, but instead were under the extemporaneous
control of E. Particularly with respect to the whole
training procedure, which was quite aversive for some Ss
(at the beginning of training, especially), E was
coached to maintain a pleasant, nonjudgmental posture in
the face of poor performance, and to find every
opportunity to reinforce. Overall, E probably gave
supplementary social reinforcement (in addition to
feedback) for about every third correct response or
chain of correct responses, except when S was doing
poorly, in which case every correct response was
reinforced.


Presentation and scoring of test problems 


The procedure for presenting problems was illustrated
with the cowboy materials on a preceding page. The E
presented a “focus instance,” that is, an exemplar of
the concept. The S then selected instances until he
could name the concept or until he had selected six
instances without being able to name the concept, at
which point the problem terminated. Each time S pointed
to an instance, E told him whether the instance showed
the concept (positive instance) or did not show the
concept (negative instance). If S stopped trying to
solve the problem or stated an incorrect conclusion with
which he was apparently satisfied, E attempted to keep
him performing with one of a series of standardized
prompts. For example, if S made no task-relevant
responses for a period of 10-15 seconds, E said “What
are you going to do now?” If S stated an incorrect
concept and made no further task-relevant responses for
10-15 seconds, E said “Are you sure that’s my secret?”
When these prompts failed to elicit further behavior, E
presented a stronger prompt. If S accumulated enough
evidence to solve the problem but did not volunteer a
conclusion, E said “As soon as you are sure you know,
tell me the secret.” The language of all the prompts and
the contingencies for their use were prescribed. The E
was permitted no unstandardized remarks. No feedback was
given as to the correctness of responses to test
problems, except as was indirectly involved in
procedures already described.

At the beginning of each experimental session in which
test problems were presented, S first solved a series of
simpler problems designed to provide warmup, to furnish
a high initial frequency of reinforcement, and to make
sure that the concepts that S would subsequently have to
attain were in his repertoire. In this orientation
exercise, E named concepts while S pointed to all of the
positive instances of each concept. There were 16
concepts treated in this manner including the 8 which
would have to be attained in order to solve the problems
later in the session. If S made an error he was prompted
to make the correct response and, in addition, any item
upon which an error occurred was repeated after an
interval until S made an unprompted correct response.

The E wrote a protocol for each test problem, recording
in coded form the instances S selected, the statements S
made, and the statements E made in the sequence in which
these occurred. Immediately following a day’s
administration of problems, Es exchanged protocols to
check them for legibility, completeness, accuracy of
subject and problem identification, and accuracy of
coding.

An S solved a problem when he selected a series of
instances and stated a conclusion such that the
instances implied the conclusion and no other.
Performance on terminal problems was scored on a 3-point
scale as follows: (a) S solves the problem and neither
makes any logically unnecessary choices of instances nor
states any incorrect conclusions (2 points); (b) S
solves the problem but makes one or more unnecessary
choices or states one or more incorrect conclusions
(1 point); (c) S fails to solve the problem (0 points).
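
The scoring rule can be illustrated with a short sketch. This is not the scoring program used in the study (which ran on an IBM 1620 and is not reproduced in the source); it is a hypothetical reconstruction under two stated assumptions: a problem counts as solved only when the feedback received leaves exactly one conjunctive concept possible and S's final statement names it, and a choice counts as "logically unnecessary" when it rules out no remaining concept. All function names are illustrative, and the card-array dimensions are used as the example materials.

```python
from itertools import product

DIMENSIONS = {
    "number": ("one", "two"),
    "color": ("red", "green"),
    "form": ("rectangle", "diamond"),
}


def all_concepts():
    """All 27 conjunctive concepts; a dimension absent from the dict is irrelevant."""
    options = [[None, *values] for values in DIMENSIONS.values()]
    return [
        {d: v for d, v in zip(DIMENSIONS, choice) if v is not None}
        for choice in product(*options)
    ]


def shows(concept, card):
    """True if the card (dict of dimension -> value) is a positive instance."""
    return all(card[d] == v for d, v in concept.items())


def remaining(candidates, card, is_positive):
    """Concepts still possible after E's feedback on one card."""
    return [c for c in candidates if shows(c, card) == is_positive]


def score_protocol(target, focus, choices, stated_conclusions):
    """Score one problem 2, 1, or 0 (assumed reading of the 3-point scale).

    target: the concept E has in mind; focus: the positive card E points to;
    choices: cards S picked, in order; stated_conclusions: concepts S named,
    in order, the last being the final answer.
    """
    candidates = remaining(all_concepts(), focus, True)
    unnecessary = 0
    for card in choices:
        narrowed = remaining(candidates, card, shows(target, card))
        if len(narrowed) == len(candidates):
            unnecessary += 1  # assumption: a card that rules nothing out is unnecessary
        candidates = narrowed
    solved = (
        bool(stated_conclusions)
        and stated_conclusions[-1] == target
        and candidates == [target]
    )
    if not solved:
        return 0
    wrong = sum(1 for c in stated_conclusions if c != target)
    return 2 if unnecessary == 0 and wrong == 0 else 1


# Ideal strategy: vary each factor in turn while holding the others constant.
focus = {"number": "one", "color": "red", "form": "rectangle"}
target = {"color": "red", "form": "rectangle"}
ideal = [
    {"number": "two", "color": "red", "form": "rectangle"},
    {"number": "one", "color": "green", "form": "rectangle"},
    {"number": "one", "color": "red", "form": "diamond"},
]
print(score_protocol(target, focus, ideal, [target]))
```

In this sketch the focus instance leaves eight concepts possible, and each of the three single-factor variations halves the set, so the ideal protocol of three choices and one conclusion earns 2 points. The limit of six selections per problem is not modeled.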
The protocols were punched on cards and then scored on
an IBM 1620 computer using a program written for this
purpose. One of the virtues of mechanical processing was
to provide a means for tracking down and eliminating the
clerical errors which arise when a large quantity of
[...] material must be analyzed. An elaborate set of
internal consistency checks was built into the program.
The computer rejected 21 protocols, or 2.5% of the total
number of protocols, because of coding errors. A
keypunch error had been made in 11 cases. In the
remaining 10 cases the fault was in the protocol. These
protocols were examined and decisions were made as to
what the codes should have been. Herein lies the only
aspect of the analysis in which subjective judgment was
required.

The terminal problems included within the training
programs were scored on the same 3-point scale as the
test problems. However, problems presented during
training were scored on the spot by E and only an
abbreviated protocol was written.


Design and Subjects 


There were three treatment conditions. One group (P)
received part-task training. Another group (W) received
whole-task training. A control group (C) received no
treatment. Every S in the former two groups received
training with two sets of materials. One-sixth of the Ss
in each of the training groups was assigned to one of
the six possible permutations of the three sets of
training materials taken two at a time. About 48 hours
after completing training, Ss in the training groups
received eight test problems to assess retention. The
retention problems involved the second set of materials
with which S received training. Each control S received
the retention problems during his first experimental
session. One-third of the control Ss received problems
involving each of the three sets of materials.

About 48 hours after receiving the retention 
problems Ss received a series of eight test prob- 
lems to assess transfer of training. The transfer 
problems entailed the set of materials which S
had not encountered during training. One-third 
of the control Ss received transfer problems in- 
volving each of the three sets of materials, a dif- 
ferent set than was encountered during the re- 
tention problems. 
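
The counterbalancing of materials for the training groups can be sketched as follows. This is an illustration only, with hypothetical names: it lists the six ordered pairs of training materials and, for each, the set used for retention (the second training set) and for transfer (the set not encountered in training), as the design describes.

```python
from itertools import permutations

MATERIALS = ("cards", "cowboys", "pencils")


def training_orders():
    """Six permutations of the three material sets taken two at a time."""
    orders = []
    for first, second in permutations(MATERIALS, 2):
        # The one set not used in training serves for the transfer problems.
        (transfer,) = set(MATERIALS) - {first, second}
        orders.append(
            {
                "first_task": first,
                "second_task": second,
                "retention": second,   # retention uses the second training set
                "transfer": transfer,
            }
        )
    return orders


ORDERS = training_orders()
for order in ORDERS:
    print(order)
```

With one-sixth of each training group assigned to each order, every set of materials serves equally often as first task, second task, and transfer set.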

With respect to both the retention and transfer 
problems, each S received 2 zero-dimensional problems,
2 one-dimensional problems, 2 two-dimensional 
problems, and 2 three-dimensional problems. The 
order of presentation of problems was randomized 
for each S independently of other Ss. 

The Ss were 53 second-semester first graders 
from a predominantly middle-class school located 
in a new housing development on the outskirts of 
a Midwestern city of 30,000. These Ss were ran- 
domly selected from among all of the first graders 
in the school and randomly assigned to experi- 
mental conditions. Since the study was conducted 
over a 3-month period, the experimental conditions were
scheduled in a predetermined random order. There were 18
Ss in Group W and in Group C but only 17 in Group P.
There was to have been an eighteenth S in this group;
however, the last S to be run (who appeared to be making
normal progress) had to be dropped because of the
impending end of the school year. He was replaced by a
dummy case at the cell mean to balance the design for
statistical purposes.

Classroom teachers administered the California 
Test of Mental Maturity (Long Form, 1963 Re- 
vision). Unfortunately one teacher found it neces- 
sary to terminate the examination in the middle of 
a subtest because of inattention and disorderly
behavior, so it is not possible to report IQs or MAs.
Raw score (not including Delayed Memory subtest score)
means and standard deviations were 64.6 and 8.2 for
Group P, 69.3 and 7.4 for Group W, and 66.4 and 7.0 for
Group C (F = 1.77, df = 2/52, p > .05).

TABLE 1
Mean Training Time in Minutes

                  First task            Second task
Materials         Part       Whole      Part       Whole
                  training   training   training   training
Cards              96.4       76.3       96.2       53.0
Cowboys           144.9       93.2       81.7       65.3
Pencils           135.9       98.2      107.7       85.2
All materials     125.7       89.2       95.2       67.8

Note.—The SDs (estimated from MSe terms) were 25.9 for
the first task and 24.4 for the second task.


Results

Acquisition 

Table 1 contains mean training times. 
Table 2 contains mean percentage of pos- 
sible score on the last 12 training problems.
Analysis of variance indicated that
on both tasks Group P performed significantly
better (α = .01) on the problems
whereas Group W completed training in a 
significantly shorter period of time. Based 
on the estimates of number of responses 
made during training, which were de- 
scribed earlier, and the training times that 
appear in Table 1, it is estimated that 
during training Group P made relevant, 
overt responses at the rate of about 3.7 per 
minute. The rate for Group W is esti- 
mated to have been about 7.9 per minute. 

The mean training times for Group P
were considerably higher than the times
obtained during preexperimental develop-
ment of the programs. Part of the dis-
crepancy was no doubt due to the fact that
second graders were used in most of the
developmental work while first graders were
employed in the experiment. Also at two
points in the versions of the program used
in the experiment S had to reach a criterion
before proceeding. These criteria were not
part of the preexperimental procedure.

TABLE 2
Mean Percentage of Possible Score on the
Last Twelve Training Problems

               |  First task   |  Second task
Materials      | Part  | Whole | Part  | Whole
Cards          | 88.9  | 56.9  | 63.9  | 57.6
Cowboys        | 49.3  | 54.2  | 95.4  | 49.4
Pencils        | 84.0  | 43.7  | 81.2  | 63.2
All materials  | 74.1  | 51.6  | 76.9  | 54.4

Note.—The SDs (estimated from MSe terms)
were 17.8 for the first task and 18.0 for the second
task.

It might be argued that if Group W had 
been allowed as much training time as 
Group P it would have performed as well 
on the terminal problems. Figure 1 pic- 
tures performance over blocks of 12 prob- 
lems for Group W. Notice that performance 
reaches an asymptote by the sixth or 
seventh block on the first task. Conse- 
quently, it seems highly improbable that 
further practice would have improved the 
performance of Group W very much. 

There were significant differences be- 
tween materials on the first training task 
due in large part to the relatively poor per- 
formance with the cowboys. We have ob- 
served that the story line employed with 
the cowboys tends to interfere with the 
problem solving of some children, who in- 
sist that the “friends of the sheriff” must 
have rifles or must ride horses. Other in- 
vestigators have made similar observations 
(Bruner et al., 1956, p. 111). Evidently by 
the time he reached the second task S had 
learned enough so that he was not dis- 
tracted by the story line. 

There were significant materials effects 
on training time due to the pencils. The 
pencils were handled by S and shuffled by 
E after each problem. These manipulations 
took time. 


Retention and Transfer 


Table 3 presents the means for the reten-
tion and transfer problems. In both cases
there were significant differences among
treatments. Comparisons (α = .01) by
the Newman-Keuls procedure indicate
that on retention problems Group P was
superior to both the other groups and
Group W was superior to Group C. On the
transfer problems Groups P and W were
not significantly different but both were su-
perior to Group C.

There were significant differences in the


Fig. 1. Mean percentage of possible score on blocks of 12 training problems within the
whole training group. (Vertical axis: mean percent possible score; horizontal axis: blocks
of twelve problems on the first and second tasks.)


difficulty of the transfer problems accord-
ing to the number of relevant stimulus
dimensions the problems entailed. The
means were 44.9, 49.1, 42.6, and 32.9% for
the zero-, one-, two-, and three-dimen-
sional problems, respectively. There was
also a significant Materials × Dimensions
interaction for the transfer problems, due


TABLE 3
Mean Percentage of Possible Score
on Retention and Transfer Problems

Materials      | Part     | Whole    | No
               | training | training | training
Retention
Cards          | 62.5     | 57.3     | 10.4
Cowboys        | 86.5     | 35.4     | 18.8
Pencils        | 76.1     | 55.2     | 16.7
All materials  | 75.0     | 49.8     | 15.3
Transfer
Cards          | 44.8     | 58.3     | 30.2
Cowboys        | 49.0     | 41.7     | 21.9
Pencils        | 65.6     | 46.9     | 22.9
All materials  | 53.1     | 49.0     | 25.0


| ee 
ae SDs (estimated from MS. terms) 
.3 for the retention problems and 18.8 for 


e transfei 
across ae when scores are pooled 


primarily, for reasons which are not clear 
to the author, to the relatively great diffi-
culty of the zero-dimensional card prob-
lems.


Discussion 


The results indicate some limits to the 
generality of the rule proposed by Naylor 
and Briggs that whole training will be su- 
perior to part training for “highly orga- 
nized” tasks. Of course it may be that there 
are characteristics of tasks, such as the 
amount and nature of its organization, 
which systematically interact with the 
length and complexity of the responses 
required from S at various stages during 
training, but this is evidently a matter 
about which there is much to be learned. 
The present author is pessimistic about the 
likelihood that broad generalizations con- 
cerning method-task interactions will 
emerge in the near future. Too much de- 
pends upon the specific features of the 
methods and the details of implementation 
of these features. 

Both informal observation and the ob- 
jective data suggest that in the present 
study Ss who received whole training did
not acquire a systematic instance-selection
skill, or, at least, that they did not ac- 
quire the same skill as was acquired by 
most Ss who received part training. The 
whole group made a mean of 1.88 logi- 
cally unnecessary choices of instances per 
retention problem, whereas the mean was 
1.08 for the part group (t = 3.19, df =
33, p < .01). Furthermore, there is reason
to believe that there were qualitative dif-
ferences in the conclusion-drawing behav-
ior of the two groups. Most Ss in the part
group learned to draw conclusions based 
on a restricted set of instances containing 
only the minimal information logically 
necessary to solve the problem. The Ss in 
the whole group seldom gave conclusions 
until they had selected a larger-than-logi- 
cally-necessary set of instances. The typi- 
cal S in Group W rapidly and, seemingly, 
haphazardly selected instances until a con- 
clusion occurred to him. Most Ss in Group 
P, on the other hand, selected instances 
slowly and their behavior usually con- 
formed to the method of varying each fac- 
tor in succession while holding all other 
factors constant. At the point at which
just enough information was available log- 
ically to solve the problem, the typical S 
who received part training usually offered 
a conclusion. The marked differences be- 
tween Groups P and W in rate of response 
during training can be traced to the con- 
trasting patterns of behavior typical of Ss 
in the two groups. 

It seemed possible that children who re- 
ceived part training would be able to solve 
a high percentage of the problems created 
with a new set of materials. Obviously this
did not happen, indicating the need for a 
more refined method of producing general- 
ized stimulus control (Anderson, 1965). 

Group P did not do as well as would be
expected on the basis of the preexperimen-
tal data, probably because of the fact that
first graders participated in the experiment 
whereas second graders were employed 
during program development. Actually the 
part procedure was successful with the ma- 
jority of the first graders. Of the 17 Ss who 
completed part training, 11 scored 80% or 
better on the retention problems and only 
three scored below 50%. The latter three 
fell at the bottom of the distribution of 
aptitude test scores. 

Unnecessarily small and redundant steps 
and failure to provide for integration of 
subskills and concepts may be shortcom- 
ings of any particular lesson employing a 
small-step procedure. The guidelines for 
lesson development which have emerged 
from the programmed instruction movement 
have not been demonstrated to guard 
against these shortcomings; indeed, it is 
possible that such deficiencies, particularly 
unnecessary redundancy, are endemic in 
currently available small-step, programed 
lessons. In the present instance, a small- 
step procedure worked relatively well. 
Whether a small-step, programed pro- 
cedure would consistently prove best in 
other instances remains to be seen.

Instructions that varied according to purpose and amount of informa-
tion were manipulated in this experiment. 1 set of instructions out- 
lined a conservative focusing strategy, another set described the 
structure of the stimulus material, and a 3rd set presented only mini- 
mum information about the task. 102 Ss between 19 and 30 years of 
age were assigned randomly to 6 treatment groups of 17 each. 2 groups 
received each set of instructions; in addition, 1 group of each 2 
was given information about a principle for securing information; the 
other was not. Concept attainment was more efficient for the 3 groups
receiving the principle; also the rank order of the effects of instruction
from most to least efficient was strategy instructions, structure instruc-
tions, and minimum instructions.


Verbal instructions are given to subjects 
(Ss) in experiments to facilitate their per- 
formance of the experimental task. Until 
recently, however, instructions have not 
been manipulated systematically to deter- 
mine their effects. In part, lack of sufficient 
attention to the critical role of instructions 
in experiments is related to failure in 
specifying clearly the dimensions on which 
instructions may vary. 

Instructions may vary according to pur-
pose, method of presentation, amount of
information presented, specificity of the in-
formation presented, and amount of non-
verbal guidance. The latter three dimen-
sions are relative and cannot be described
except in connection with a specific
experiment. The method of presenting
instructions, however, may be audio, visual,
or audiovisual. Instructions may be formu-
lated to achieve various purposes: (a) to
acquaint S with the specific stimulus mate-
rial or the more general task characteristics,
(b) to acquaint S with the specific response
or the more general performance desired,
(c) to provide S with information, such as
a strategy or a method, to apply to solution
of the task, (d) to provide S with information
of a substantive type, such as an advance
organizer or a principle, to employ in
performing the task, (e) to provide a set
related to the recall or use of information or
abilities, and (f) to manipulate the level of
motivation of S.

1 The research reported herein was performed
pursuant to a contract with the United States
Office of Education, Department of Health, Edu-
cation, and Welfare. This study is one of a series
in concept learning of the Wisconsin Research and
Development Center for Cognitive Learning under
the direction of the senior author, and part of the
data reported were included as part of the doc-
toral dissertation of the junior author.

Purposes a and b deal with clarification
of the nature of the task and assume some
degree of unfamiliarity of the task by S.
Instructions used in experiments typically
provide information concerning the nature
of the stimulus, the response, or both. Pur-
poses c and d involve the presentation of
additional information, usually designed to
facilitate performance of the task by S.
Some research has been done with principles
and advance organizers. Purposes e and f
involve an attempt to directly manipulate
thought processes or perceptions of S. For
example, instructions may be designed to 
encourage S to recall relevant information 
or abilities that may be used in the present 
task. Also, instructions may be designed to 
produce varying amounts of stress in Ss, to 
indicate a reward system, to focus atten- 
tion, or to produce other conditions assumed 
to be related to the level of motivation. 

Research on instructions in experiments
on concept learning is meager. The first
experiment using instructions as an in-
dependent variable was by Archer, Bourne,
and Brown (1955). They compared instruc-
tions encouraging analytic problem solving
to instructions encouraging nonanalytic
problem solving. Analytic instructions did 
not significantly improve overall perform- 
ance, but reduced variability and facilitated 
performance on the more complex tasks. 
Braley (1963) found no difference in effi- 
ciency of concept attainment between in- 
structions oriented toward concept attain- 
ment procedures and those oriented toward 
problem solving. In contrast to the fore- 
going studies, Tagatz (1963) reported in- 
structions and training with one type of 
stimulus material to have a negative effect 
upon performance with a second type of 
stimulus material. Thus, the effects of in- 
structions on concept attainment are not 
clearly established. However, instructions 
are being used in many experiments on con- 
cept learning and are also of primary im- 
portance in curriculum research and devel- 
opment. 

Instructions that convey information
about a principle have a fairly consistent
history of facilitating initial learning and
transfer. Hilgard, Irvine, and Whipple
(1953) had one group of Ss learn to per-
form card tricks by memorizing a sequence
while the other group learned a procedure
for determining the sequence. Initial acqui-
sition was faster for the memory group, and
there were no differences in retention of the
first trick. However, significantly more Ss
in the understanding group performed the
second trick on the retention test. The
understanding group, moreover, demon- 
strated significantly more transfer than the 
memory group to new tasks where simple 
transposition was not an effective aid to 
solution. Sassenrath (1959) found that 
learning to learn a principle during train- 
ing facilitated learning to learn a reversal 
principle during the transfer period. Other
experiments have also demonstrated the
facilitating effect of the knowledge of a
principle (Craig, 1956; Forgus & Schwartz,
1957; Haslerud & Meyers, 1958; Hendrick-
son & Schroeder, 1941). The results of many
experiments are sufficiently clear to predict
that a principle facilitates learning. In the
present experiment, this prediction was
tested with three sets of instructions that
varied according to purpose.

The three sets of instructions developed
for this experiment were varied deliberately
according to purpose and indirectly in the
amount of information provided. The
method of presentation, specificity of in-
formation, and amount of nonverbal guid-
ance were held constant. One set of in-
structions of about 400 words contained the
minimum amount of information regarding
the stimulus material and the desired re-
sponses necessary for the Ss to proceed
with the task. The second set of instruc-
tions of about 500 words incorporated the
minimum instructions and more complete
information regarding the organization, or
structure, of the stimulus material. The
third set of instructions of about 850 words
incorporated the preceding and also a de-
scription of a conservative focusing strat-
egy. The three sets of instructions and the
instructions regarding the principle are
included later in the method section of this
article.

The prediction was that the strategy in-
structions would be associated with best
performance, structure instructions next,
and minimal instructions with least efficient
performance. The prediction was based on
the assumption that instructions could be
written that would have a facilitative
rather than an interfering effect. Also, the
work of Bruner, Goodnow, and Austin
(1956) suggests that an understanding of
the relationship of the attributes and values
incorporated in the stimulus material facil-
itates performance. In addition, the use of
a conservative focusing strategy insures the
attainment of the concept with greatest
certainty that the concept identified by S
as correct is in fact correct. Neither of these
propositions regarding structure and strat-
egy, however, has been tested experimen-
tally. It may be observed also that the con-
servative focusing strategy could not be
readily taught without an understanding
of the structure of the material.


Method


Subjects 


The Ss were enrolled in educational psychology
classes at the University of Wisconsin. They were
assigned at random to each of six treatments with
the restrictions of a proportional number of each
sex in each group, and an equal total number of Ss
in each group. There were 102 Ss (90 females, 12
males) ranging in age from 19 to 30 years.


Experimental Material


The concepts were embedded in figural material 
on 128 stimulus cards, each 3 inches square. These 
cards were arranged in 16 rows and 8 columns on a 
large board. Each card contained seven attributes
with two defining characteristics, or values, in all 
combinations as follows: border number, one or 
two; border continuity, solid or broken; figure 
number, one or two; figure size, large or small; 
figure texture, solid or spotted; figure color, red 
or green; and figure shape, circle or ellipse. The 
material is described in detail in Klausmeier, 
Harris, and Wiersma (1964). The concepts to be 
attained were conjunctive with three relevant 
attributes; for example, two large circles. Each S
attained the same sequence of four concepts.
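
The structure of the stimulus material can be made concrete with a short sketch. The attribute names and values below come from the description above; the function names `build_cards` and `belongs` and the dictionary representation of a card are illustrative assumptions, not part of the original study materials.

```python
from itertools import product

# Seven attributes, each with two values, as listed in the text.
ATTRIBUTES = {
    "border number": ("one", "two"),
    "border continuity": ("solid", "broken"),
    "figure number": ("one", "two"),
    "figure size": ("large", "small"),
    "figure texture": ("solid", "spotted"),
    "figure color": ("red", "green"),
    "figure shape": ("circle", "ellipse"),
}


def build_cards():
    """Return every combination of attribute values as a list of dicts."""
    names = list(ATTRIBUTES)
    return [dict(zip(names, values)) for values in product(*ATTRIBUTES.values())]


def belongs(card, concept):
    """A conjunctive concept holds when every relevant attribute matches."""
    return all(card[name] == value for name, value in concept.items())


cards = build_cards()

# Example from the text: the three-attribute concept "two large circles".
concept = {"figure number": "two", "figure size": "large", "figure shape": "circle"}
positives = [card for card in cards if belongs(card, concept)]

print(len(cards), len(positives))
```

Seven two-valued attributes give 2^7 = 128 cards, which agrees with the 16 × 8 board, and a concept with three relevant attributes leaves four attributes free, so 2^4 = 16 cards are positive instances.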


Experimental Procedure


The Ss were scheduled individually to come to
the learning laboratory. After a brief introduction
to the experimenter (E), the appropriate set of
instructions was read to S. There were six sets of
instructions: minimal with principle, minimal
without principle, structure with principle, struc-
ture without principle, strategy with principle, and
strategy without principle. Each S was allowed to
attain as many concepts as possible, up to a max-
imum of 10, during a 55-minute period.
The minimal instructions presented only enough
information for S to understand that he was
to attain concepts. The structure instructions in-
corporated all the minimal information and also
described the organization of the stimulus ma-
terial according to the seven attributes, each
having two defining characteristics as described
above. In addition to the description, the
instructions required S to demonstrate that he
could pick out at least one card representing each
attribute and defining characteristic. The strategy
instructions incorporated all the information of the
minimal and structure instructions and also de-
scribed a conservative focusing strategy for attain-
ing concepts by selecting successive cards that
varied from the focus card by only one defining
characteristic. In addition, the instructions re-
quired S to demonstrate that he could pick out
the cards that varied in only one defining
characteristic from the focus card.

The experimental procedure for all
Ss was for E to present a focus card and
for S to select successive cards as belonging to the
concept of which the focus card was a member.
The S could offer a hypothesis, his estimate of
what the concept was, at any time after making
a card choice. No time limit was given S.
He was instructed, however, to get the correct
concept as quickly as possible.

The minimal instructions, the additional para-
graph about the structure of the material, the two
further paragraphs about the strategy, and the
principle follow:


Minimal Instructions 


This experiment is concerned with how people 
attain concepts. Your task is to attain several 
concepts that I have in mind. I am going to 
teach you how to attain the concepts. Your per- 
formance today in no way reflects on your in- 
telligence. Also, there is nothing tricky about 
the experiment. 

Now let us define what a concept is on this
board. Concepts on this board are stated in terms
of one or more of seven attributes listed on this
slip of paper. For example, all the cards con-
taining circular figures form the concept, circles.
Show me four cards which belong to this con-
cept. [The E waits until S points out four ex-
amples.] That’s correct. Consider another ex-
ample. All the cards with small red elliptical
figures form the concept, small red ellipses.
[The E makes sure S knows four examples of
yes-cards.] That’s correct. A very large number
of concepts can be formed, having one, two, or
any combination of the seven attributes that are
listed on your slip of paper. Please state a one-,
then a two-, and then a three-attribute con-
cept.... That’s fine. In this experiment, your
job is to attain concepts of the type we have
just discussed. Do you have any question about
what a concept is?

Listen carefully now to the procedure for
attaining a concept that I have in mind. I shall
indicate one card which belongs to the concept.
This card we shall call the focus card. This focus
card contains all seven attributes, and part of
these seven attributes form the concept I have
in mind. Your job is to test attributes of other
cards in relation to the focus card to determine
which attributes form the concept. Read off the
number of the card you are checking and I shall
tell you “yes” if it belongs to the concept and
“no” if it does not belong to the concept.

When you think you know the concept, mark
it on the slip of paper and give it to me. If the
concept is correct, the task is completed. If not,
I’ll simply say “not correct, continue” and you
will continue selecting cards until you again
offer a concept. Use the slip of paper at all times
and take it to the board with you.

Your job is to ascertain the correct concept
as quickly as possible. Do you have any ques-
tions?


Structure Instructions 


Now, let us examine the structure of the ma-
terial. [The E points to the board.] There are
128 cards on this board. This slip of paper lists
the attributes contained on each card on the
board. Pick any one card on the board and point
out the seven attributes, listed on your slip of
paper, which are also on the card. Do your check-
ing aloud so that I may follow you. Start with
number of borders. [The E checks that S fol-
lows the sequence given on the slip and checks
all attributes.] Do you understand that any
card you would have chosen would have all the
seven attributes? Do you have any questions on
the seven attributes contained on each card?


Strategy Instructions 


The best way to ascertain the concept is to 
select your successive cards so that each card is 
exactly like the focus card, except for one at- 
tribute. Let us start with card #168 as an ex- 
ample of a focus card. To check the attribute, 
number of borders, we may go to card #61. Note 
that card #61 is exactly like the focus card ex- 
cept for the number of borders. To check the 
type of border we might go to card #95. Note 
that card #95 is exactly like the focus card ex- 
cept for type of border. Referring to your slip 
of paper, you tell me a card that differs by one 
attribute only for each of the five remaining at- 
tributes listed. Start with number of figures 
(6-no. of figures, 48-size, 88-texture, 105-color, 
12-shape). You have just finished varying all 
seven attributes, one at a time, from the focus 
card. You should note that the cards which 
vary one attribute lie in the same row or column 
as the focus card. Do you have any questions 
about how to vary the seven attributes, one at 
a time, from the focus card? 

If you had been trying to attain a concept of 
one or more attributes for which card #168 was 
the focus card, you would have proceeded in 
exactly the same manner. However, to some of 
your card choices I would have responded with 
“yes” and to others “no.” On the basis of all the 
“yeses” and “nos” you could have determined 
the concept I had in mind. 


The Principle 


To get any concept I have in mind, you must
use both the “yes” and “no” cards and there is 
an important principle to learn. The principle is 
that when you vary one attribute from the focus 
card and I give you a “no,” the attribute of the 
focus card varied is part of the concept. When 
you vary one attribute from the focus card and 
I give you a “yes,” the attribute is not part of 
the concept. For example, suppose #168 is the
focus card and you select #61 which is like #168
except that it has two borders. If I said “no” to
card #61 you would know that one border is
part of the concept since the focus card is
exactly like #61 except it has one border. How-
ever, if I said “yes” to #61 you would know that
number of borders is not part of the concept
since it makes no difference that #168 has one
border and #61 has two borders. Let us try
another example. Suppose #168 is the focus
card and #12 is a “no” card. What does this
tell you concerning the concept? ...Circle is
part of the concept since this is the only at-
tribute varied from card #168.

This principle involving my giving you a
“no” applies only when one attribute is varied
from the focus card. Suppose I respond “no”
to card #36. You could infer nothing about the
concept. Can you tell me why? Correct. Because
two attributes from the focus card were varied;
size and texture. Do you have any questions
about how to use the principle to use the “no”
cards to determine the concept?
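
The strategy and the principle together amount to a simple procedure, sketched below as an illustration only. The focus card, the hidden concept, and the function names `vary_one`, `belongs`, and `conservative_focus` are hypothetical; the card numbers quoted in the instructions are not modeled because the board layout is not given here.

```python
# Attribute names and values follow the description of the stimulus material.
ATTRIBUTES = {
    "border number": ("one", "two"),
    "border continuity": ("solid", "broken"),
    "figure number": ("one", "two"),
    "figure size": ("large", "small"),
    "figure texture": ("solid", "spotted"),
    "figure color": ("red", "green"),
    "figure shape": ("circle", "ellipse"),
}


def vary_one(focus, attribute):
    """Return a card exactly like the focus card except for one attribute."""
    first, second = ATTRIBUTES[attribute]
    flipped = second if focus[attribute] == first else first
    return {**focus, attribute: flipped}


def belongs(card, concept):
    """The experimenter's answer: True for "yes", False for "no"."""
    return all(card[name] == value for name, value in concept.items())


def conservative_focus(focus, hidden_concept):
    """Vary each attribute once; a "no" means the focus value is in the concept."""
    inferred = {}
    for attribute in ATTRIBUTES:
        probe = vary_one(focus, attribute)
        if not belongs(probe, hidden_concept):  # experimenter says "no"
            inferred[attribute] = focus[attribute]
    return inferred


# Hypothetical focus card and hidden concept ("two large circles").
focus = {
    "border number": "one",
    "border continuity": "solid",
    "figure number": "two",
    "figure size": "large",
    "figure texture": "solid",
    "figure color": "red",
    "figure shape": "circle",
}
hidden = {"figure number": "two", "figure size": "large", "figure shape": "circle"}

recovered = conservative_focus(focus, hidden)
print(recovered == hidden)
```

Under these assumptions seven card choices, one per attribute, are always enough to identify a conjunctive concept, which is why the strategy yields the concept with certainty.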


Any question by S was answered by reading
again the appropriate part of the instructions.
When there were no further questions, the focus
card for the first problem was presented. Time
was kept from when S located the focus card until
the correct concept was offered. The card choices
and hypotheses offered by S were recorded serially.
This permitted tallying the total number of card
choices and the total number of hypotheses offered.

Experimental Design 


A 2 × 3 × 4 factorial design with repeated meas-
ures was used. The bileveled factor was principle
versus no principle and the trileveled factor was
instructions: minimal, structure, or strategy. The
repeated measure was on each of four concepts
attempted by each S. Consequently, there were
17 Ss per cell with repeated measurements to as-
certain the effect of ordinal position, or of prac-
tice across trials.
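
The degrees of freedom implied by this design can be worked out directly. The sketch below is a bookkeeping check under the usual split-plot partition (an assumption about how the analysis was laid out, although it agrees with the df columns of the analysis of variance tables); the variable names are illustrative.

```python
# Design constants taken from the text.
n_instructions = 3   # minimal, structure, strategy
n_principle = 2      # principle versus no principle
n_concepts = 4       # repeated measure
n_per_cell = 17      # Ss per treatment group

n_cells = n_instructions * n_principle
n_subjects = n_cells * n_per_cell

df_subjects_within = n_cells * (n_per_cell - 1)

df = {
    "Grand mean": 1,
    "Instructions (I)": n_instructions - 1,
    "Principle (P)": n_principle - 1,
    "Concepts (C)": n_concepts - 1,
    "I x P": (n_instructions - 1) * (n_principle - 1),
    "I x C": (n_instructions - 1) * (n_concepts - 1),
    "P x C": (n_principle - 1) * (n_concepts - 1),
    "I x P x C": (n_instructions - 1) * (n_principle - 1) * (n_concepts - 1),
    "Subjects within (I x P)": df_subjects_within,
    "Subjects (I x P) x C": df_subjects_within * (n_concepts - 1),
}

total = sum(df.values())
print(n_subjects, total)
```

The total equals the number of observations (102 Ss × 4 concepts = 408) because the grand mean is counted as one degree of freedom.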


Results


Since not all Ss attained the same num- 
ber of concepts, the analyses were carried 
out on the first four concepts attempted. 
In Table 1 are shown the means for each 
of the four concepts for the six groups ac-
cording to principle or no principle and the
type of instructions. From the data in Table 
1, one can readily see that Ss became more 
efficient with practice and that Ss who 
received instructions incorporating the con- 
servative strategy and the principle were 
the most efficient concept attainers. 


TABLE 1
Mean Time in Seconds on Four Concepts

               |        Concept
Treatment      |  A  |  B  |  C  |  D
Principle
  Strategy     | 378 | 265 | 107 |
  Structure    | 513 | 352 | 206 | 158
  Minimal      | 516 | 440 | 177 |
No principle
  Strategy     | 504 | 420 | 202 |
  Structure    | 443 | 393 | 222 | 265
  Minimal      | 523 | 506 | 297 |


The results of the analysis of variance
using time to criterion as the dependent
variable are shown in Table 2. The F ratio
for the effect of a principle is significant be-
yond the .05 level. Similarly, the F ratio
for the three types of instructions is signifi-
cant beyond the .05 level. The effect for the
ordinal position of the concepts, designated
concepts in Table 2, is significant beyond
the .01 level. These results indicate that
instructions with a principle facilitate con-
cept attainment and that Ss improve per-
formance with practice. This practice ef-
fect is, however, confounded with the
difficulty level of the concepts. Since there
is no basis for assuming that the concepts
were of unequal difficulty, the improvement
may be attributed to practice.

A second dependent variable was the 
number of card choices S made in attain- 
ing each of the four concepts. Table 3 shows 
the mean number of card choices for each of 
the four concepts for the six groups of Ss 
according to principle or no principle and 
other instructions received. The fewest card 
choices were required by the groups receiv- 
ing the strategy instructions, and the other 
two groups made about the same number of 
choices. The groups instructed about the 
Principle required fewer choices than those 
not instructed. 

The results of the analysis of variance 
using the number of card choices as the 


TABLE 2 


Analysis of Variance for Mean Time
in Seconds


Source of variation F ratio 


af 
xt 
Principle (P) z fi 
1 -63 
Tae (©) 3| 2,212/172 | s9.88** 
ee 2 86,460 | ns 
PxG : 13,839 ns 
Ixpx¢ dpe lln 
Sj oi within 6 17,871 | ns 
. 96 69,644 
Subjects (IxP) x G | 288 24612 
Total 408 
a ae eae a 
*p < .05.
**p < .01.

TABLE 3
Mean Number of Card Choices for
Four Concepts

               |          Concept
Treatment      |  A   |  B   |  C   |  D
Principle
  Strategy     | 10.5 |  9.7 |  7.1 |  8.1
  Structure    | 18.8 | 19.5 | 15.7 | 11.9
  Minimal      | 16.6 | 19.6 | 11.8 | 11.1
No principle
  Strategy     | 13.4 | 12.3 | 11.3 |  9.5
  Structure    | 19.3 | 24.1 | 22.9 | 20.5
  Minimal      | 26.0 | 27.0 | 18.9 | 20.1


dependent variable are shown in Table 4.
The results here are essentially the same as
when time to criterion was the dependent
variable. The effects of principle versus no
principle, of instructions according to the three
purposes, and of the repeated measures of
the concepts are significant beyond the .01 level,
indicating that the number of card choices
was a more powerful measure than time to
criterion. Apparently, the strategy instruc- 
tions and the use of the principle either en- 
couraged caution in making card choices or 
led to an increased amount of time to lo- 
cate the specific cards that varied only one 
attribute from the focus card. 
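
How an F ratio follows from the mean squares of such an analysis can be shown with a short sketch. The mean squares are read from Table 4 as printed; the choice of error terms (between-subjects effects tested against Subjects within (I × P), effects involving concepts tested against Subjects (I × P) × C) is the conventional split-plot rule and is an assumption here, as is the helper name `f_ratio`. Because the printed mean squares are rounded, computed ratios need not reproduce the printed F column exactly.

```python
# Mean squares as printed in Table 4 (number of card choices).
MEAN_SQUARES = {
    "Instructions (I)": 3474,
    "Principle (P)": 2971,
    "I x P": 252,
    "Concepts (C)": 685,
    "I x C": 108,
    "P x C": 26,
    "I x P x C": 51,
}
MS_SUBJECTS_WITHIN = 279      # error term for between-subjects effects
MS_SUBJECTS_BY_CONCEPTS = 72  # error term for effects involving concepts

BETWEEN_SUBJECTS = {"Instructions (I)", "Principle (P)", "I x P"}


def f_ratio(effect):
    """Divide an effect's mean square by its assumed error mean square."""
    if effect in BETWEEN_SUBJECTS:
        error = MS_SUBJECTS_WITHIN
    else:
        error = MS_SUBJECTS_BY_CONCEPTS
    return MEAN_SQUARES[effect] / error


f_values = {effect: f_ratio(effect) for effect in MEAN_SQUARES}
for effect, value in f_values.items():
    print(f"{effect:18s} F = {value:.2f}")
```

For instructions and the principle the computed ratios fall close to the printed values of 12.42 and 10.62.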

Table 5 shows the mean number of hy- 
potheses offered by the six treatment groups 
on each of the four concepts. The fewest 
hypotheses were offered by the three groups 


TABLE 4
Analysis of Variance for Number of
Card Choices

Source of variation      | df  | Mean square | F ratio
Grand mean               |   1 | 105,378     |
Instructions (I)         |   2 |   3,474     | 12.42*
Principle (P)            |   1 |   2,971     | 10.62*
Concepts (C)             |   3 |     685     |  8.13*
I × P                    |   2 |     252     | ns
I × C                    |   6 |     108     | ns
P × C                    |   3 |      26     | ns
I × P × C                |   6 |      51     | ns
Subjects within (I × P)  |  96 |     279     |
Subjects (I × P) × C     | 288 |      72     |
Total                    | 408 |             |

*p < .01.

TABLE 5
Mean Number of Hypotheses Offered
on Four Concepts


Treatment 


Principle 
Strategy 
Structure 
Minimal 

No principle 
Strategy 
Structure 
Minimal 


receiving the principle. Fewest hypotheses 
prior to attaining the concepts were offered 
by the two groups receiving the strategy 
instructions. Those groups having the struc- 
ture and minimum instructions offered 
about the same number of hypotheses. 
Analysis of variance on the number of 
hypotheses to solution provides essentially 
the same results as for the preceding de-
pendent variables. Effects beyond the .01
level of significance were obtained for in- 
structions, the principle, and the concepts. 
Finally, the results using the dependent 
variable of amount of potential information 
available at first hypothesis were consid- 
ered. The amount of information each S 
could have potentially obtained at the time 
of offering a first hypothesis was deter- 
mined by examining and comparing all his 


TABLE 6
Analysis of Variance for Number of
Hypotheses Offered

Source of variation      | df  | Mean square | F ratio
Grand mean               |   1 | 1,281       |
Instructions (I)         |   2 |    15.5     |  8.31*
Principle (P)            |   1 |    25.0     | 13.37*
Concepts (C)             |   3 |    14.7     | 12.13*
I × P                    |   2 |     0.5     | ns
I × C                    |   6 |     1.9     | ns
P × C                    |   3 |     1.4     | ns
I × P × C                |   6 |     0.4     | ns
Subjects within (I × P)  |  96 |     1.9     |
Subjects (I × P) × C     | 288 |     1.2     |
Total                    | 408 |             |

*p < .01.

TABLE 7
Mean Amount of Information Potentially
Secured at Time Subject Offered First
Hypothesis on Four Concepts


Concept 
Treatment 


Principle 
Strategy 7 
Structure 5 
Minimal 5. 

No principle 
Strategy 6 
Structure 4 
Minimal 4 


choices. The amount of information corre-
sponds to the number of attributes tested
and could range from one to seven. Table 7
shows the mean amount of information po- 
tentially obtained by each group. The strat- 
egy group with the principle obtained all 
the potential information necessary to at- 
tain the concept prior to stating a first hy- 
pothesis. No other group reached this level 
of proficiency. 

Table 8 shows effects at the .01 level of 
significance for instructions, the principle, 
and the successive concepts. There was a
significant Instructions × Concepts interac-
tion. Apparently, this interaction is a re-
flection of poorer performance on the first
two concepts by the structure group than 
the minimal group and a reversal of this 


TABLE 8
ANALYSIS OF VARIANCE FOR AMOUNT OF INFORMATION
POTENTIALLY SECURED AT FIRST HYPOTHESIS

Source of variation         df    Mean square   F ratio
Grand mean                   1       15,688
Instructions (I)             2         26.8
Principle (P)                1         30.8
Concepts (C)                 3         28.9     24.
I × P                        2          1.0     ns
I × C                        6          3.9      3.31*
P × C                        3          1.6     ns
I × P × C                    6          1.0     ns
Subjects within (I × P)     96          2.3
Subjects (I × P) × C       288          1.2
Total                      408

*p < .01.


trend on the third and fourth concepts, with
greater gains accruing for the structure 
group than the minimal group. 
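
As a rough check on the analysis-of-variance tables, an F ratio can be approximated by dividing an effect's mean square by the matching error mean square. The sketch below is only illustrative: it assumes the conventional error terms for a mixed design of this kind (effects involving Concepts tested against Subjects (I × P) × C), the function name `f_ratio` is hypothetical, and ratios computed from the rounded mean squares printed in Table 8 can differ slightly from the published values.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio: effect mean square divided by error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Rounded mean squares as printed in Table 8.
MS_SUBJECTS_BY_CONCEPTS = 1.2  # Subjects (I x P) x C, assumed error term

concepts_f = f_ratio(28.9, MS_SUBJECTS_BY_CONCEPTS)     # Concepts (C)
interaction_f = f_ratio(3.9, MS_SUBJECTS_BY_CONCEPTS)   # I x C

print(f"Concepts: F = {concepts_f:.2f}")
print(f"I x C:    F = {interaction_f:.2f}")
```

The small gap between the recomputed interaction ratio and the tabled 3.31 is what one would expect when the divisor has been rounded to one decimal place.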


Discussion 


As predicted, concept attainment was
facilitated least by minimal instructions,
next by instructions that presented infor-
mation about the structure of the stimulus
material, and most by instructions designed
to teach a conservative focusing strategy.
On all dependent variables, performance
improved across trials, or concepts. One 
cannot determine experimentally whether 
the improvement was a function of practice 
or of the difficulty level of the concepts. In- 
asmuch as there is no basis for assuming 
unequal difficulty of the concepts and since
the identical sequence of concepts was used 
with all Ss, one may assume that the im- 
provement resulted from practice in attain- 
ing concepts of the conjunctive type. These 
results are important in terms of under- 
standing, predicting, and controlling con- 
cept attainment in laboratory studies and 
also have implications for school learning 
of concepts. 
The effect of practice on mean solution
times was greatest from the second to the 
third concept. This might mean that posi- 
tive transfer was maximized on the third 
concept. Clearly there was positive transfer 
from the first concept to successive con- 
cepts. In addition, the effect of interproblem 
transfer was lesser for instructions without 
the principle and greater for instructions 
with the principle. Thus, knowledge of the 
principle, especially how to use information
provided by negative instances, was asso-
ciated with consistently improved perform-
ance when time to criterion was used as the
dependent measure.

Strategy instructions were closely asso-
ciated with efficient performance in terms
of number of card choices. With and with-
out the principle, strategy instructions re-
sulted in fewest card choices. It should be
noted that the conservative focusing
strategy might require more time to locate
a card that varied in only one attribute from
the focus card but that fewer card choices
would be required to attain the concept. A
further interesting observation is that
knowledge of the principle facilitated per- 
formance more than knowledge of the 
structure. Apparently knowledge of the 
principle enabled Ss to secure more informa- 
tion from fewer card choices than did 
knowledge about the structure of the ma- 
terial. This may be interpreted as indicating 
that the instructions for the principle were 
better written or better understood by Ss 
than were those for structure. The intent 
was to write them equally well. 

The combined effects of the conservative 
strategy and principle on hypothesizing be- 
havior were dramatic. The Ss who had 
received these instructions potentially ob- 
tained all of the relevant information be- 
fore offering an hypothesis, even on the 
first concept. No other group was as profi- 
cient, even on the fourth concept. 

The importance of cognitive control of 
human behavior has been described by 
Miller, Galanter, and Pribram (1960). They 
stated that all human behavior is controlled 
by a plan, which in turn is comprised of 
strategies and tactics. Strategy refers to the 
control of the global units of behavior and 
tactics to the specific units. Bruner et al. 
(1956) reported the first extensive research 
with strategies in concept attainment and 
identified four strategies. Apparently the 
same person may use any of the four strate- 
gies in attaining concepts depending on 
situational conditions. The present experi- 
ment provides conclusive experimental evi- 
dence that not only a strategy but a princi- 
ple for securing information efficiently can 
be described, and that providing this infor- 
mation to Ss markedly facilitates concept 
attainment over a short period of time. 

The present experiment provides en- 
couragement for moving ahead more rapidly 
on research in school settings in which at- 
tempts are made to develop and test in- 
structions designed to facilitate concept 
learning in various subject fields. Even 
without further research it is safe to con- 
clude that more time should be spent on 
teaching students how to learn concepts, 
the organization of the subject matter to 
be learned, and principles for securing and 
utilizing information. Further, educational 
research on various methods of teaching has
typically resulted in statistically nonsignifi- 
cant results, whereas relatively small 
changes in instructions yield highly signifi- 
cant results in laboratory settings. Educa- 
tional researchers apparently must find 
better means for controlling learning con- 
ditions in school settings, executing various 
treatments more systematically, and de- 
veloping more sensitive measures of per- 
formance. Educational psychologists and 
other educational researchers have not 
demonstrated much inventiveness during 
the past decades in any of these matters. 
EFFECTIVENESS OF STUDY-SKILLS INSTRUCTION
FOR HIGH SCHOOL SOPHOMORES¹

WARREN L. HASLAM AND WILLIAM F. BROWN
Southwest Texas State College


The Brown-Holtzman Effective Study Course: High School Level
was taught to 74 high school sophomores during the fall of 1965 at
Highlands High School, San Antonio, Texas. 59 of the 74 students
receiving the instruction were individually matched with a control
group of 59 students not receiving such instruction. Matching of the
2 groups was done on the basis of age, sex, race, intelligence quotient,
subjects being studied, and 1st 9 weeks’ grade-point average. Adminis-
tration of the Survey of Study Habits and Attitudes before and after
study-skills instruction indicated significant improvement in the meas-
ured study orientation of the experimental group. Following the
course, students in the experimental and control groups were compared
on 2 indexes of instructional results: 9-weeks’ course grades and scores
on the Effective Study Test. The experimental group was found to be
significantly higher on both indexes.


The assumption that improved study
habits and improved academic effective-
ness will result from systematic study-
skills instruction is intuitively appealing.
To date, however, published research on
the productivity of study-skills instruction
for high school students appears to be al-
most nonexistent. The investigation being
reported was, therefore, undertaken to de-
termine the effectiveness and acceptability
of study-skills instruction for high school
sophomores.

¹ This study was supported, in part, by a grant
from the Hogg Foundation for Mental Health,
University of Texas, Austin. It was based, in
part, on an unpublished master’s thesis by the
senior author at Southwest Texas State College,
1968.


Method

Design


The project was designed to determine whether
the Brown-Holtzman Effective Study Course:
High School Level could produce significant im-
provement in the scholastic motivation, study be-
havior, and academic achievement of high school
sophomores. Experimental students were selected
and received instruction in how-to-study whereas
an individually matched control group was denied
such instruction. Upon completion of the course,
the two groups were compared on two
indexes of instructional results.


Subjects


In selecting experimental students, preference
was given to those students who indicated on an
application form a desire to continue their educa-
tion beyond high school. Grades were also con-
sidered, with preference given to students with
average grades, although some above and below
average students were also selected to insure a
representative cross-section of the student body.
Finally, students having eight or more absences
recorded during the previous school year were dis-
qualified to help insure high attendance in the
course. Thirty of the 65 applicants enrolled in fifth
period study hall were accepted for the program;
44 of the 110 applicants for the after-school section
were accepted.


Procedure 


A mimeographed announcement describing the 
course objectives and content, the cost of mate- 
rials, and the application procedure was distributed 
to all sophomore English classes and was posted on 
appropriate bulletin boards around the high school 
to inform students about the Effective Study 
Course. Sophomore students interested in obtain- 
ing more information about the course were di- 
rected to meet in a designated room either before 
or after school during the week of October 25. At 
each meeting, the instructor explained the course 
in more detail, answered the students’ questions, 
and furnished interested students with application 
forms. The forms were taken home, completed, 
signed by either of their parents, and returned,
together with a fee for materials.

Two sections of the course were offered—one 
during the instructor’s normal conference period 
and the other after school. Sophomores enrolled in 
fifth period study hall were permitted to take the 
first, while all other applicants were required to 
take the section held after school. 

To facilitate selection of a control sample, 
sophomore English teachers had their students 
complete an information form giving the student's 
name, age, sex, race, college plans, and subjects
currently taken. Each student’s grade average and
intelligence quotient was obtained from the coun- 
seling office and added to his form. From the ac- 
cumulated information, experimental and control 
students were individually matched on sex, race, 
age, intelligence quotients, first 9 weeks’ grade 
averages, and subjects currently taken. Matching 
of the experimental group with a control group 
was done so that the two groups could be com- 
pared on two subsequent indexes of instructional 
effectiveness—third 9-weeks’ course grades and 
scores on the Effective Study Test (Brown, 1964a). 
Of the original 74 students selected to receive the 
study-skills instruction, 59 students (experimental 
group) were finally matched with 59 students 
(control group) that did not receive study-skills 
instruction. The remaining 15 students were not 
included in the experimental group because it was 
not possible to match them individually with con- 
trol students using the established limits of the 
matching criteria. 

At the conclusion of instruction, each experi- 
mental student’s reaction to instructor effective- 
ness, course content, and program acceptability 
was determined by administering a course evalua- 
tion questionnaire. To ascertain the unit’s effect 
upon scholastic motivation and study behavior, 
the Survey of Study Habits and Attitudes (Brown 
& Holtzman, 1967) was administered to the ex- 
perimental group both before and after instruction 
and the resulting scores were compared. Upon com- 
pletion of the course, the Effective Study Test 
(Brown, 1964a) was administered to the experi- 
mental and control groups and test scores for the 
two groups were compared in order to determine 
their relative levels of study-skills knowledge. 
Finally, course grades for the experimental and 
control Ss were collected and analyzed to deter- 
mine the unit’s influence upon subsequent scho- 
lastic success. 


Course Description 


The Brown-Holtzman Effective Study Course:
High School Level is the result of a 14-year in-
vestigation into the factors determining scholastic
success. The course is intended as a special in-
structional program for students prior to, or im-
mediately following, high school entrance. The in-
structor’s role is to serve as a program moderator
or discussion leader rather than as a lecturer on
the subject matter. The instructor is expected to
encourage maximum student participation in the
learning process to achieve the following course
objectives: (a) to motivate each student toward
developing more effective study habits; (b) to
improve each student’s study efficiency through
better utilization of his study time; (c) to im-
prove each student’s study efficiency through im-
proved organization of his study environment;
(d) to improve each student’s study efficiency
through improved reading and writing techniques;
(e) to improve each student’s efficiency in pre-
paring for and taking examinations; (f) to im-
prove the self-direction of each student through
the development of meaningful and realistic aca-
demic goals; and (g) to help each student develop
a realistic understanding of high school life and
peer acceptance problems.


Materials 


A variety of instructional and guidance mate-
rials were used in the course. Three major items,
the Survey of Study Habits and Attitudes (Brown
& Holtzman, 1967), Effective Study Guide (Brown
& Holtzman, 1964), and Effective Study Test
(Brown, 1964a), provided a systematic instruc-
tional program designed to help students improve
their study skills. The Survey of Study Habits and
Attitudes, hereafter referred to as SSHA, served
as a motivating instrument; the Effective Study
Test, hereafter referred to as EST, was used as
an evaluative and instructional instrument; and
the Effective Study Guide served as the students’
textbook. Other instructional materials utilized in
the course were the Effective Study Workbook
(Brown, 1964b), Reading and Remembering Guide
(Brown, 1962), Study Skills Surveys (Brown, 1965),
Daily Activity Schedule (Brown, 1960), and Stu-
dent-to-Student Tips (Brown, 1964d).

A special Instructor’s Manual (Brown, 1964c)
was employed to assist the teacher in planning
the course content and using the instructional ma-
terials. The manual consisted of 20 detailed les-
son plans, each designed for a 55-minute class
period, and appended materials designed to fa-
cilitate and enrich instruction and discussion. The
Instructor’s Manual suggested that the lesson plans
could be presented in whatever order the instruc-
tor deemed suitable for his situation; however,
the 20 lesson plans were presented in the recom-
mended order to insure an appropriate evaluation
of the course.


Results and Discussion


Pre-course and post-course administra-
tions of the SSHA were employed to assess
each experimental student’s study orienta-
tion before and after instruction on how to
study. The SSHA is an easily administered
measure of study methods, motivation for
studying, and certain attitudes toward
scholastic activities important in the class-
room. Since the SSHA yields
study habits and study attitudes scores, its
value lies in identifying, for each student,
specific areas of deficient academic be-
havior and scholastic motivation which
may handicap future school performance.
Table 1 reports the means and standard
deviations for pre-course and post-course
scores on all seven SSHA scales. From this
table it may be noted that each of the
SSHA scales indicates a significant im-
provement in the study habits and atti-
tudes of the experimental group. Using
Fisher’s t test for correlated samples, the
post-course means were found to be
significantly higher than the pre-course
means on all seven SSHA scales following
study-skills instruction.
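
The correlated-samples t test referred to above can be illustrated with a minimal sketch. It assumes the usual formula, the mean of the paired differences divided by its standard error; the function name `paired_t` and the five pairs of scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(pre_scores, post_scores):
    """Return (t, df) for a correlated-samples t test on post minus pre."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired")
    if len(pre_scores) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [post - pre for pre, post in zip(pre_scores, post_scores)]
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical pre-course and post-course scores for five students.
pre = [20, 22, 19, 25, 21]
post = [30, 35, 28, 36, 33]
t_value, df = paired_t(pre, post)
print(f"t({df}) = {t_value:.2f}")
```

Because each student serves as his own control, the test depends only on the spread of the individual gains, which is why the raw pairs and not just the group means and standard deviations are needed to reproduce a published t.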

The school administration felt that test- 
ing of the control students should be kept 
to a minimum so as not to interfere with 
their classroom instruction. Consequently, 
SSHA data were not collected on students 
in the control sample. Statistical data pre-
sented in the SSHA manual were, there-
fore, employed to help evaluate the
course’s effectiveness. In a reliability study 
for 237 ninth graders, the SSHA manual 
reports that the mean total score de- 
creased 1.1 points and the standard devia- 
tion increased .3 points over a 4-week in- 
terval. By comparison, the means and 
standard deviations for the experimental 
group increased 43.3 and .7 points, re- 
spectively, over the same length of time. 
Comparison of the test-retest SSHA scores 
for these two samples clearly suggests that 
the results obtained for the experimental 
group should be attributed to the effective- 
ness of their study-skills instruction. 

TABLE 1
COMPARISON OF PRE-COURSE AND POST-COURSE SURVEY OF STUDY
HABITS AND ATTITUDES (SSHA) SCORES FOR THE
EXPERIMENTAL GROUP

                         Pre-courseᵃ     Post-courseᵇ
SSHA scale                 M     SD        M     SD      MD       t
Delay avoidance          21.3    8.5     34.2    8.4    12.9   13.84*
Work methods             21.9    7.8     34.5    8.2    12.6   12.90*
Study habits             43.2   14.7     68.7   15.5    25.5   14.90*
Teacher approval         29.7    8.9     39.4    8.9     9.7   10.12*
Education acceptance     30.1    6.9     38.1    6.0     8.0    9.76*
Study attitudes          59.8   14.2     77.6   10.9    17.8   11.50*
Study orientation       103.0   26.5    146.3   25.8    43.3   14.59*

ᵃ Testing was accomplished on November 5, 1965.
ᵇ Testing was accomplished on December 6, 1965.
*p < .001.


TABLE 2
COMPARISON OF EFFECTIVE STUDY TEST (EST) SCORES FOR THE
EXPERIMENTAL AND CONTROL SAMPLES

                            Experimental     Control
                              groupᵃ          groupᵇ
EST scale                     M     SD       M     SD      MD       t
Reality orientation         21.2    2.3    19.2    3.0     2.0    4.52*
Study organization          21.6    1.9    16.6    3.0     5.0   13.12*
Writing behavior            21.0    2.0    18.4    2.4     2.6    7.53*
Reading behavior            20.4    2.6    18.1    3.0     2.3    5.66*
Examination behavior        18.8    2.5    16.8    2.3     2.0    4.34*
Total study effectiveness  103.1    7.1    89.1    9.7    14.0   11.55*

ᵃ Testing was accomplished on December 2, 1965.
ᵇ Testing was accomplished on November 29-30, 1965.
*p < .001.


The EST was administered to the ex-
perimental and control groups upon com-
pletion of the how-to-study course in order
to permit a comparison of their knowledge
about efficient study practices. The EST is
specifically designed to measure a student’s
knowledge about efficient study methods
and the factors influencing their develop-
ment. Table 2 reports the means and
standard deviations of scores for the ex-
perimental and control groups on all six
EST scales. From this table it is evident
that the experimental group was signifi-
cantly more knowledgeable about efficient
study techniques. Using Fisher’s t test for
correlated samples, differences between the
means and standard deviations for all six
subscales were found to favor the experi-
mental group at a significance level greater
than .001.
The impact of study-skills instruction
upon subsequent scholastic achievement
was assessed by employing the grade-point
averages for the first, second, and third 9
weeks’ grading periods. Grade-point av-
erages were calculated on the basis of 4, 3,
2, 1, and 0 points for letter grades of A,
B, C, D, and F, respectively. The experi-
mental students’ grade average and stand-
ard deviation during the first 9 weeks’
grade period were 2.37 and .83, respectively.
The control group’s mean and standard
deviation for the same time period were 2.33
and .84, respectively. During the third 9
weeks, the experimental sample’s mean
was 2.63 and the standard deviation was
.90. The third 9 weeks’ mean and standard
deviation for the control students were 2.37
and .86, respectively. The total grade-point
average increase for the experimental
group during the third 9-weeks’ period was
.26 and at a level of significance greater
than .001. During the same time period, the
control group’s increase was .06 and at a
significance level less than .10.
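
The grade-point computation described above amounts to averaging point values over a student's courses. A minimal sketch follows, using the 4, 3, 2, 1, 0 scale stated in the text; the function name `grade_point_average` and the sample grades are hypothetical and are not data from the study.

```python
# Point values for letter grades, as stated in the text.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def grade_point_average(letter_grades):
    """Return the mean point value of a list of letter grades."""
    if not letter_grades:
        raise ValueError("at least one grade is required")
    return sum(GRADE_POINTS[grade] for grade in letter_grades) / len(letter_grades)


# Hypothetical 9-weeks' grades for one student in four courses.
example_grades = ["B", "C", "A", "C"]
example_gpa = grade_point_average(example_grades)
print(f"Grade-point average: {example_gpa:.2f}")
```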

The experimental group’s reaction to 
instructor effectiveness, course content, and 
program acceptability was evaluated upon 
completion of instruction by administering 
a specially constructed course evaluation 
questionnaire. Anonymous responses to the 
60-item questionnaire were tabulated and 
converted to percentages. Tabulation of re- 
sponses to all 60 statements indicated that 
the experimental students’ reactions were 
decisively positive to all evaluated aspects 
of the how-to-study course. 

In interpreting the research data, it 
should be kept in mind that motivation is 
an important factor influencing scholastic 
achievement. The motivation variable was
not considered in matching the two groups
of students, although motivation is par-
tially implied in a student’s grade-point
average, and grade-point average was one
criterion used for matching. Although the
research method employed does have its
limitations, the statistical analysis of data
from the SSHA, EST, and grade-point av-
erages generally indicates a level of sig-
nificance acceptable to most authorities.
One may conclude from the research data 
that the study-skills instruction given to 
the sample of high school sophomores did 
increase their knowledge about effective 
study procedures, did improve their over- 
all study orientation, and did improve their 
subsequent academic achievement. 
LONG-TERM CORRELATES OF CHILDREN’S LEARNING
AND PROBLEM-SOLVING BEHAVIOR¹


46 girls and 51 boys enrolled in the 7th grade were presented a series of
learning and problem-solving tasks including paired-associate learn-
ing, discrimination learning, incidental learning, concept of probability,
conservation of volume, verbal memory, and anagrams. Only in inci-
dental learning did the performance of girls exceed that of boys, but
in all tasks except paired-associate learning and discrimination learn-
ing sex differences were found in the correlations between performance
in the experimental tasks and school grades for 1 term in English,
social studies, science, and mathematics. When IQ was partialed out
there was a greater decrease in the number of significant correlations
for girls than for boys. Performance in the initial block of trials in
the 2 paired-associate tasks correlated significantly with school grades
for boys, but not for girls. A repetition of the study with 73 additional
Ss resulted in similar general relations, but some differences in the
specific patterns of correlations.


¹ This study was supported in part by a grant
... Health.


Recent experimental studies of children’s
learning and problem solving have had little
influence on educational psychology, except
when they have been derived from the
theory of Piaget (e.g., Freyberg, 1966).
This is unfortunate, for there have been
rapid advances during the past decade in
our understanding of children’s cognitive
processes. The question is often raised
whether the experimental tasks used by
psychologists have any relevance to
the problems of classroom learning. Experi-
mental studies usually involve short-term
samples of learning and problem
solving in artificial, highly controlled situ-
ations, while classroom learning occurs over
long periods of time and is concerned with
materials and skills. There is no
obvious reason to assume that the behavior
sampled in the laboratory should be quali-
tatively different from that occurring in the
classroom, but little attention has been paid
to investigating this problem. Stake (1961)
has reported the results of a factor analysis
using school grades, scores on achievement,
aptitude, and intelligence tests, and per-
formance on learning tasks, and although
he interprets his results in support of the
definition of intelligence as the ability to
learn, his data provide evidence for the
predictive validity of learning tasks for
school performance. A second study
(Stevenson & Odom, 1965) investigated the
relation between teacher’s ratings of chil-
dren’s learning ability and children’s per-
formance in paired-associate and discrimi-
nation learning, concept formation, and
anagram tasks. Highly significant correla-
tions were found for two of the five tasks:
paired associates and anagrams.

The present study provides additional
information about the utility of learning
and problem-solving tasks as predictors of
school success. Seventh graders were given
a series of tasks that were adapted from
standard experimental learning and prob-
lem-solving tasks for group presentation.
Performance on these tasks was correlated
with school grades and the influence of
intellectual level was determined by par-
tialing out IQ scores. Additional informa-
tion about children’s school performance
was obtained from teachers’ ratings of the
children’s learning ability.
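
Partialing out IQ, as described above, can be sketched with the standard first-order partial correlation formula. The source does not state which computational procedure was used, so this is an assumption; the function name `partial_correlation` and the three input correlations are hypothetical illustrations, not values from the study.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with the linear influence of z removed."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical correlations: task score with grade, task with IQ, grade with IQ.
r_task_grade = 0.60
r_task_iq = 0.50
r_grade_iq = 0.50

adjusted = partial_correlation(r_task_grade, r_task_iq, r_grade_iq)
print(f"Task-grade correlation with IQ partialed out: {adjusted:.2f}")
```

When a task and school grades are both related to IQ, the adjusted correlation is smaller than the raw one, which is the kind of decrease the abstract reports for many of the girls' correlations.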


Method


Subjects 


The subjects (Ss) were 46 girls and 51 boys
enrolled in three seventh grade classrooms of a
junior high school in a middle-class area of
Minneapolis. The average CA of the boys was
13.1 years (SD = .6), and of the girls, 13.0 years
(SD = .4). The average IQ obtained from the
verbal scale of the Lorge-Thorndike group test
was 101.2 for the boys (SD = 13.3), and 104.2
(SD = 14.1) for the girls.


Grades 


The Ss’ grades in four courses, English, social
studies, science, and mathematics, were available
for the quarter immediately preceding the study.
The grades were coded so that an “F” was given
a score of 1 and an “A+” was given a score of 13.
The average grade was “C” (a score of 5.8), with
a maximum difference of 1.7 points for the
various courses. The standard deviations for the
four courses varied from 2.9 to 4.2 for the boys
and from 2.6 to 3.6 for the girls. For both sexes
variability was smallest for English grades and
greatest for science grades.
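
The grade coding can be sketched as a simple lookup. The text gives only the end points (“F” = 1, “A+” = 13) and the average; the intermediate steps in the list below, with plus and minus grades between the letters, are an assumption made for illustration, and the names `code_grade` and `nearest_letter` are hypothetical.

```python
# Assumed 13-step scale; only F = 1 and A+ = 13 are stated in the text.
LETTERS_LOW_TO_HIGH = [
    "F", "D-", "D", "D+", "C-", "C", "C+",
    "B-", "B", "B+", "A-", "A", "A+",
]
GRADE_SCORE = {letter: score for score, letter in enumerate(LETTERS_LOW_TO_HIGH, start=1)}


def code_grade(letter: str) -> int:
    """Return the numeric code for a letter grade."""
    return GRADE_SCORE[letter.strip().upper()]


def nearest_letter(score: float) -> str:
    """Return the letter grade whose code is closest to a (mean) score."""
    index = min(max(round(score), 1), len(LETTERS_LOW_TO_HIGH))
    return LETTERS_LOW_TO_HIGH[index - 1]


print(code_grade("F"), code_grade("A+"))
print(nearest_letter(5.8))
```

Under this assumed scale a mean of 5.8 falls closest to “C”, which agrees with the average grade reported in the text.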


Tasks 


This study reports data from eight learning 
and problem-solving tasks selected from a series 
of 13 tasks that were administered to Ss as part 
of a larger study. The tasks that were eliminated 
were those that failed to discriminate school per-
formance. They included a task involving the dis-
crimination of abstract forms, one involving the
discrimination of classes of common objects, proba- 
bility learning, a test of the concepts of conjunc- 
tion and disjunction, and a test of children’s ability 
to estimate ages of adults. 

All of the tasks were administered to intact 
classrooms by means of sound movies, some in 
black and white and some in color, depending 
upon the nature of the task. A narrator introduced 
each task, giving all necessary instructions. This
method provided a careful control for the con-
sistency of procedure across classrooms.

The Ss responded in booklets that corresponded
to the content of each film.

Each classroom was visited on 9 successive
days, and the total time spent on a given day
varied from 20 to 30 minutes. Different orders of
presentation were used in the three classrooms so
that each test was given at least once during the
first 4 and once during the last 4 of the 9 days.
The teacher introduced the experimenters (Es)
as persons from the university who were interested
in seeing how persons of Ss’ ages could per-
form on different kinds of tasks. The Ss were told
that their performance would have no influence
on their school grades.

The following tasks were included: 

Paired associates (abstract words). Six stimulus-
response pairs were used. Stimulus elements were
trigrams with high association values (e.g., pag,
kor) and response elements were common abstract
words (e.g., health, joy). The stimulus element
of each pair appeared first on the screen, followed
by the paired presentation of the stimulus and
response elements. After the list had been pre-
sented, the first page of the response booklet ap-
peared on the screen and the narrator gave the
instructions for responding. The projector was
stopped while Ss circled the response element they
thought was associated with each stimulus ele-
ment. This procedure was followed for eight pres-
entations of the list.

Paired associates (abstract forms). This task 
was identical to the preceding one except that 
different nonsense syllables were used for the 
stimulus elements and relatively simple Japanese 
characters were used as the response elements. 

Discrimination learning. The S’s task was to
discriminate common objects on the basis of their
class membership. Four classes (i.e., tools, food,
toys, people) were represented by four objects. In
turn, each of the classes was associated with one
of four geometric shapes. Four objects, one from
each class, appeared across the bottom of each
page of the booklet, with one of the geometric
shapes centered at the top. On each trial the cor-
responding page of S’s booklet was shown on the
screen. The Ss were instructed to choose the ob-
ject they thought was correct and after a short
interval E pointed to the correct one on the screen.
The correct pairings were arbitrary and differed
across classrooms. A total of 64 trials was pre-
sented.

Incidental learning. An 8-minute skit ie 
simple plot was filmed in sound and color. T’ " 
plot allowed the elaboration of incidental aspec' 
of the film, such as dialogue, background actions; 
clothing, and set. The skit took place in & living 
room where a man and a woman were conversing 
about such matters as the content of the ea 
paper, the husband’s tobacco, and similar il ; 
topics. Reference was made to the whereabou' ies 
Andy, with the implication that he was the oot 
child. A visiting woman and a delivery ae ai 
peared as part of the plot, and gave ee of 
port to this implication. whee Andy finally 

red, he turned out to be a dog. 

Sine Instructions were given before the film, a 
it was always shown on the last day of cata 
under the guise of a reward for particip®# a 
the study. When the film was over, 8s eae 
a booklet containing 31 multiple-choice a9" 
false questions about the skit. Question tent 
asked about incidental aspects of the ° 

costuming, and physical setting of the skit. in color 

Concept of probability. This film was tain 
The narrator showed Ss two boxes, one © 
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ing 30 red pegs and one with 30 white pegs. He 
then interchanged 10 pegs from each box, leaving 
two complementary 2:1 ratios of the different 
colors in the two boxes. Seven questions were 
asked about various probability relations one 
might expect if blindfolded one drew pegs from 
the boxes in various manners. For example, the 
narrator asked how many red pegs he would get 
if he drew three pegs from Box A, the assortment 
of pegs that would be produced with successive 
reaches into Box B, and whether Box A or B 
would be most likely to yield a red peg if only 
one could be drawn. 
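
The arithmetic behind questions of this kind can be sketched briefly. The example below is only an illustration: it assumes that Box A is the box that began with 30 red pegs (the text does not say which box is which), so that after the exchange it holds 20 red and 10 white pegs. The function names are hypothetical.

```python
from fractions import Fraction

# Assumed contents after exchanging 10 pegs between the boxes.
# Which box is "A" is an assumption made for illustration only.
BOX_A = {"red": 20, "white": 10}
BOX_B = {"red": 10, "white": 20}


def prob_red(box):
    """Probability that a single blind draw yields a red peg."""
    total = box["red"] + box["white"]
    return Fraction(box["red"], total)


def expected_red(box, draws):
    """Expected number of red pegs in `draws` pegs drawn without replacement."""
    # For draws without replacement the expectation is draws * (red / total).
    return draws * prob_red(box)


print(prob_red(BOX_A), prob_red(BOX_B))
print(expected_red(BOX_A, 3))
```

Under these assumptions a single draw is twice as likely to yield a red peg from Box A as from Box B, and three pegs drawn from Box A would be expected to include two red ones.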

Conservation of volume. This task was a modifi- 
cation of the standard Piaget task. The film was 
in color, and the narrator presented two beakers 
of colored water, with several black lines indicat- 
ing various levels on the beakers. Four questions 
were asked about the amount of water that would 
be displaced if a ball of clay were dropped in the 
second beaker after an identical ball had been 
dropped in the first. The first question used the 
ball of clay as it was, while the three remaining 
questions were asked about the level to which the 
water would rise when the clay had been modified 
in several ways: rolled into a sausage shape, 
squashed into a pancake, and cut into small pieces. 

Verbal memory. This task was derived from 
the “Memory for Stories I: The School Concert” 
at Year X of Form M of the Revised Stanford- 
Binet. The narrator read the instructions, indicat- 
ing Ss would be asked questions about the story, 
and then read the one-paragraph story. The 
standard questions contained in this subtest were 
printed in Ss’ booklets. The maximum number of 
points was 14. 

Anagrams. The Ss were asked to make as many 
words as possible in 8 minutes from the letters 
in the word “generation.” The narrator demon- 
strated the anagrams game with the word “fed- 
eral,” by constructing “flare,” “lead,” and “deer” 
as examples. Only those words found in the 
dictionary were allowed. 
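
The letter constraint in the anagrams task can be illustrated with a short sketch. It assumes that each letter of the source word may be used no more often than it occurs there, which the text does not state explicitly, and it does not check the dictionary requirement. The function name is hypothetical.

```python
from collections import Counter


def can_form(source, candidate):
    """True if `candidate` can be spelled using each letter of `source`
    no more often than it occurs in `source` (assumed scoring rule)."""
    available = Counter(source.lower())
    needed = Counter(candidate.lower())
    return all(available[letter] >= count for letter, count in needed.items())


# The three demonstration words from the text, plus one that fails
# because "federal" contains only one "d".
for word in ["flare", "lead", "deer", "dread"]:
    print(word, can_form("federal", word))
```

A full scorer would also need a word list to enforce the dictionary rule; that part is left out here.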


Results and Discussion 


The mean number of correct responses 
for each of the experimental tasks is pre- 
sented in Table 1. None of the tasks ap- 
peared to be inappropriately easy or diffi- 
cult for the seventh-grade Ss, for the 
within-group variability was reasonably 
large for each of the tasks. In general, there 
were no significant sex differences in either 
level of performance or variability. In 
incidental learning, however, girls made a 
significantly greater number of correct re- 
sponses than boys (t = 3.28, df = 89, p < 
.01). Despite the lack of significant sex 
differences in average level of performance, 
subsequent analyses were performed sepa- 
rately for boys and girls because of the 
possibility that sex differences might appear 
in the patterns of relations. 


TABLE 1 

Mean Number of Correct Responses for Each Task According to Sex of Subject 

Task | Boys M | Boys SD | Girls M | Girls SD 
PA (abstract words) | 34.86 | 11.76 | 38.76 | [?].56 
PA (abstract forms) | 33.53 | 12.33 | 36.45 | 10.69 
Discrimination learning | 33.82 | 14.22 | 33.08 | 13.56 
Incidental learning | [illegible] | [illegible] | [illegible] | [illegible] 
Concept of probability | [illegible] | [illegible] | [illegible] | [illegible] 
Conservation | 3.23 | 1.15 | 3.41 | .89 
Verbal memory | 7.86 | 3.83 | [?].80 | 3.19 
Anagrams | 18.68 | 7.81 | 22.02 | 8.08 

Note. N (range) = 39-48 for boys, 39-46 for girls. 

The significant correlations between per- 
formance on the experimental tasks and 
school grades are presented in Table 2. 
(The degrees of freedom in this and sub- 
sequent tables vary for different correla- 
tions depending upon the number of Ss for 
whom information was available.) The 
most consistent correlations were found 
between grades and the two forms of 
paired-associate learning and discrimina- 
tion learning. These correlations did not 
differ notably among the various courses. 

There were pervasive sex differences in 
the types of relations found for the remain- 
ing tasks. The incidental learning and con- 
cept of probability tasks were significantly 
related to grades in all but one instance 
for boys, but in no case were they signifi- 
cantly related to girls’ grades. On the other 
hand, the conservation task, verbal mem- 
ory, and anagrams were consistently related 
to the grades of girls, but not of boys. 
There is no obvious basis for interpreting 
these differences. Nevertheless, the results 
offer clear evidence that the behavior 
sampled in brief experimental tasks is sig- 
nificantly related to long-term classroom 
performance. The processes involved in 
many of the tasks are apparently similar 
to those required for success in the class- 
room, even though the mode of presenta- 
tion and the materials are highly dissimilar 
in the two situations. 

The correlations of school grades with IQ 
and with performance on the experimental 


tasks are presented in Table 3. The correla- 


TABLE 2 

Correlation of School Grades with Performance on Experimental Tasks 

Task | English 
Boys 
PA (abstract words) | [illegible] 
PA (abstract forms) | .63** 
Discrimination learning | .35* 
Incidental learning | .35* 
Concept of probability | .40** 
Conservation | 
Verbal memory | 
Girls 
PA (abstract words) | .33* 
PA (abstract forms) | [illegible] 
Discrimination learning | [illegible] 
Concept of probability | 
Conservation | .62** 
Verbal memory | .65** 
Anagrams | .61** 

* p < .05. 
** p < .01. 


tions of grades and IQ were considerably 
higher for girls than for boys. Performance 
in the experimental tasks also tended to be 
highly correlated with IQ; however, there 
were no sex differences in the magnitude of 
the correlations. 

To determine the relative influence of 
intellectual level on the relation between 
performance in the experimental tasks and 
school grades, correlations were computed 
between the latter two variables with the 


TABLE 3 

Significant Correlations Between Lorge-Thorndike IQ (Verbal), School Grades, and Performance on Experimental Tasks 

Variable | Boys | Girls 
School grades 
English | .56** | .76** 
Social studies | .48** | .82** 
Science | .65** | .75** 
Mathematics | .89** | .68** 
Experimental tasks 
PA (abstract words) | .46** | .49** 
PA (abstract forms) | .37* | .53** 
Discrimination learning | .40** [column unclear] 
Incidental learning | .34* [column unclear] 
Concept of probability | .53** [column unclear] 
Conservation | .48** [column unclear] 
Verbal memory | .56** | .61** 
Anagrams | .61** | .60** 

* p < .05. 
** p < .01. 
contribution of IQ partialed out. The re- 
sults, summarized in Table 4, reveal strik- 
ing differences for boys and girls. When 
these results are compared with those of 
Table 2, it can be seen that the number of 
significant correlations decreased from 21 
to 13 for boys and from 23 to 5 for girls. 
Intellectual level thus tended to be a com- 
mon determinant in the two situations of 
the performance of girls, but not of boys. 
After IQ was partialed out, the magnitude 
of the correlations remained high for boys, 
and in the conservation task became sig- 
nificant for science and mathematics. In 
three instances the correlations when IQ 
was partialed out were negative, twice for 
girls in social studies and once for boys in 
science. Performance of boys, both in the 
experimental tasks and in school, is ap- 
parently highly dependent upon factors 
other than intellectual level. The present 
data provide no indication of what such 
factors might be; however, it is likely that 
they involve differences in motivational 
and personality characteristics. Many stud- 
ies have found, for example, that such char- 
acteristics as level of anxiety and achieve- 
ment motivation tend to be more important 
determinants of the performance of males 
than of females. 
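
The effect of partialing out IQ can be illustrated with the standard first-order partial correlation formula. This is a sketch only: the text does not state how the partial correlations were computed, and the input values below are hypothetical rather than taken from the tables.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy: correlation between task performance and grades
    r_xz: correlation between task performance and IQ
    r_yz: correlation between grades and IQ
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values chosen for illustration, not data from the study.
moderate_iq_link = partial_correlation(0.50, 0.50, 0.50)
strong_iq_link = partial_correlation(0.50, 0.50, 0.80)
print(round(moderate_iq_link, 3), round(strong_iq_link, 3))
```

With the same task-grade correlation, the partial correlation shrinks more when grades are more strongly tied to IQ, which is the general pattern one would expect when IQ is a common determinant of both measures.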


TABLE 4 


Significant Correlations Between 
Performance on the Experimental 
Tasks and School Grades 
(IQ Partialed Out) 


Task | English | Social studies | Science 


34° 65°" 


52°" 


.31* 
Tncidental learning ee 
Concept of probability 

Conservation 
Verbal memory 
Anagrams 


Girls 
PA (Abstract words) . 
PA (Abstract forms) 34 
Discrimination - 


45e* 


Boys 
PA (Abstract words) 
165° 


PA (Abstract forms) 
Discrimination 


idental i: ae 

Concept of probability ele 
ition 

Verbal memory 
TABLE 5 

Significant Correlations Between Performance on First Trial Block of Experimental Tasks and School Grades 

Task | English | Social studies | Science | Math 
BOR (Abstract words) .7* | 3a 462. | .a7ee 
PA (abstract Torms) ‘ager | legee | le7ee | cezee 
Discrimination learning 34° 
PA (Abstract words) 
PA (Abstract forms). 39° -4g°° 


Discrimination learning 


* p < .05. 
** p < .01. 

The sensitivity of the paired-associate 
and discrimination learning tasks as pre- 
dictors of school grades was determined by 
correlating the mean number of correct 
responses during the first block of trials 
with grades. Only on these tasks could such 
correlations be computed, since the other 
tasks did not involve repetitive trials. The 
significant correlations are presented in 
Table 5. Performance on the first block of 
trials in the paired-associate tasks was 
remarkably sensitive in predicting the 
grades of boys. This is especially notable 
since the first block of trials represented 
less than a 3-minute sample of behavior. 
The factors producing the sex differences in 
the relations between early performance in 
the paired-associate tasks and grades are 
not clear. Whatever they are, however, they were 
evident to the boys’ teachers, for the teach- 
ers’ ratings of effectiveness of learning were 
more highly related to performance on Trial 
Block 1 of the two paired-associate tasks 
for boys (r = .37 and .67) than for girls 
(r = .20 and .31). 

Correlations were also computed between 
performance on the experimental tasks and 
scores on the Iowa Tests of Basic Skills. 
The patterns of relations were highly simi- 
lar to those found for school grades; thus 
the data are not presented in detail. Again, 
approximately equal numbers of significant 
correlations were found for boys and girls, 
but when IQ was partialed out the decrease 
in the number of significant correlations 
was much greater for girls than for boys. 

Each teacher was asked to divide the 
class into five approximately equal-sized 
groups according to the student’s general 
learning ability. The correlations between 
these ratings and performance in the ex- 
perimental tasks were significant for girls 
in all tasks except incidental learning and 
concept of probability. For boys, significant 
correlations were found for the two paired- 
associate tasks, incidental learning, and 
verbal memory. 


TABLE 6 


Significant Correlations Between Grades in Grade 7 and Performance at Grade 6 With and Without IQ Partialed Out 


Task : : S 
Boys 
PA (Abstract words) -65** (.55**) pean id ui) 
Incidental learning 3" ‘ 
Concept of probability is 
eae e ° 
zaerrton pe .64** (.44’ * ee (,46* 
a “er (.434) .77** (.58**) .73* 5o** (.46*) 
PA (Abstract words) -69%* (,62%*) i we 
Theidental learning : ie 
‘Oncept of probability 608" ae : , 
Conservation Be 
erbal memory -40* ‘oon i 
hale so es) .62** (.44**) .56' 


** p < .01. 

Note. Values in parentheses are correlations between variables with IQ partialed out. 

Additional information. Data were also 
available for 35 boys and 38 girls for whom 
course grades were obtained for the first 
quarter of their seventh grade at the same 
junior high school attended by the other Ss 
in this study. (Grades are not given in the 
Minneapolis school system until the seventh 
grade.) These Ss were given all of the tasks 
discussed earlier except paired associates 
(abstract forms) and discrimination learn- 
ing when they were in the sixth grade, 7 
months prior to the assignment of grades. 

The significant correlations between per- 
formance in the experimental tasks and 
school grades are presented in Table 6. The 
correlations are of approximately the same 
magnitude as those found for the seventh- 
grade Ss. The correlations between perform- 
ance and grades with IQ partialed out are 
presented in parentheses in Table 6. Al- 
though the particular correlations that were 
significant differed somewhat from those 
found for the seventh-grade Ss, the tend- 
ency again was for the proportion of sig- 
nificant correlations remaining after IQ 
was partialed out to be greater for boys 
than for girls. Correlations between per- 
formance on the first block of trials in 
paired associate (abstract words) and 
grades in the four seventh-grade courses 
were significant for boys (r > .38) except 
for science, but none of the correlations was 
significant for girls. 

The study began with the general ques- 
tion of ascertaining whether the experi- 
mental tasks used by child psychologists in 
studies of learning and problem solving 
have any relation to the long-term learn- 
ing that occurs in school. The answer to 
the question proved to be more complicated 
than was anticipated, for the predictive 
validity of many of the tasks differed, de- 
pending upon the sex of S. There was no 
tendency for the more complex, and pos- 
sibly more school-like, tasks to produce 
higher relations with grades than the 
simple rote-learning tasks. In fact, it is 
of interest that the two tasks that yielded 
the most consistently significant correla- 
tions, paired-associate learning and dis- 
crimination learning, were presented in the 
most highly structured context and used 
highly artificial materials. 

Many questions remain unresolved in this 
preliminary study. The most important of 
these concern (a) the bases of the greater 
importance of intellectual factors for girls 
than for boys in producing significant cor- 
relations between performance in the ex- 
perimental tasks and school grades, and 
(b) the stronger relation for boys than for 
girls between performance in the early 
phases of paired-associate learning and 
school grades. There is a surprising paucity 
of information helpful in answering such 
questions. Further investigations should, 
however, not only increase our understand- 
ing of the correlates of the learning process 
in children, but also might yield new and 
effective means of predicting children’s 
academic success. 
ROLE OF IRRELEVANT CUES IN THE FORMATION 
OF CONCEPTS BY LOWER-CLASS CHILDREN 


The present study sought to assess differences in concept formation 
abilities of children at varying developmental stages. The 
task employed was a perceptually oriented one, with stimulus di- 
mensions verbalized in advance and positive exemplars of the con- 
cept continuously visible to the children. Results obtained on 1st-, 
3rd-, and 5th-grade Negro lower-class males revealed differences in 
concept ability associated with both chronological age and IQ scores. 
Increasing the amount of irrelevant stimulus information elicited 
more errors in all age groups but this variable did not interact sig- 
nificantly with developmental level. The response latencies of the 
older and brightest Ss increased with the number of irrelevant cues, 
whereas those of the less intelligent children did not. This finding was 
interpreted as suggestive of possible developmental differences in 
information processing. 


In recent years, there has been in- 
creasing theoretical and empirical interest 
in children’s conceptual performances, and 
most investigations have reported positive 
relationships between age (chronological 
and mental) and degree of conceptual 
proficiency (e.g., Long, 1940; Long & 
Welch, 1941; Osler & Fivel, 1961; Piaget, 
1930; Sigel, 1953). The direction of this 
developmental relationship is not invari- 
ant, however, as attested to by some recent 
work which demonstrated that older chil- 
dren may be inferior to younger children 
when more complex stimuli are employed 
(e.g., Friedman, 1965; Klugh & Roehl, 
1965; Osler & Kofsky, 1965; Osler & 
Trautman, 1961). 

Although it is clear that developmental 
variations exist in children’s conceptual 
performance, the issue of what factors ac- 
count for such differences has not elicited 
agreement. Two theoretical posi- 
tions have been particularly prominent in 
this area. The first position, advanced by 
Piaget and his coworkers (e.g., Inhelder 
& Piaget, 1958), stresses the role of 
maturational changes in under- 
lying cognitive structures. A second posi- 
tion, postulated by Kendler and Kendler 
(1959), asserts that developmental differ- 
ences in conceptual functioning are attrib- 
utable to the advanced capacity for verbal 
mediation associated with higher matura- 
tional levels. Neither of these two view- 
points readily yields predictions of a nega- 
tive relation between age and cognitive 
performance. 

The present study attempted to assess 
an alternative explanation of develop- 
mental differences, derivable from an in- 
formation theory approach to concept for- 
mation (Bourne & Restle, 1959; Hovland, 
1952). This position appears capable of 
accounting for negative relationships pre- 
viously obtained. Within this viewpoint, 
the parameter of primary concern is the 
informational value of the stimuli to the 
subject (S). Systematic addition of irrele- 
vant stimulus information increases the 
difficulty of concept problems for adults, in 
terms of the number of errors and time re- 
quired for solution (Archer, Bourne, & 
Brown, 1955). The possibility suggests it- 
self that the same relation between stim- 
ulus complexity and concept difficulty exists 
in older but not in younger children because 
of a differential capacity for processing 
complex stimulus information. Thus, older 
children may take longer than younger chil- 
dren to solve certain kinds of conceptual 
problems, and may actually find them 
more difficult than younger children who 
are not as aware of the complexity. The 
purpose of the present study was to assess 
this possibility. 

The concept formation task employed 
in the present investigation was a percep- 
tually oriented one which did not involve 
memory or learning. Positive instances of 
the concept were continuously visible to 
the child, and all stimulus dimensions were 
verbalized prior to the concept task. These 
procedures were employed in order to 
minimize the role of variables other than 
conceptual ability which might contribute 
to age variation, such as stimulus famil- 
iarity, availability of labels, and differences 
in recall. 

The expectation in the present study was 
that developmental differences in concep- 
tual performance would be obtained even 
when such possible confounding variables 
were controlled, and, furthermore, that in- 
formation processing strategies should dif- 
fer with developmental level. More specifi- 
cally, it was predicted that the older and 
more intelligent the child, the more closely 
his performance should approximate that of 
adults. In accordance with Bourne and 
Restle’s model (1959), it was predicted 
that task difficulty (as measured by both 
the number of errors and the latency of 
response) should show a more consistent 
relation to the amount of irrelevant stim- 
ulus information for the high developmental 
Ss. 
Subjects 


The Ss were 72 Negro male children drawn 
from the first-, third-, and fifth-grade classes of two 
elementary schools in the Harlem area of New 
York City. The mean chronological ages of these 
groups were 6 years 10 months, 9 years 2 months, 
and 11 years 2 months, respectively, with corre- 
sponding standard deviations of 3.5, 6.5, and 5.9 
months. These children had all participated in a 
previous study concerned with perceptual corre- 
lates of reading disability (cf. Katz and Deutsch, 
1963), and the experimenters were, therefore, 
familiar to all Ss. The children were primarily 
from lower socioeconomic backgrounds as assessed 
by parental questionnaires. Although this type 
of group has been characterized by a relatively 
high incidence of academic failure, there appears 
to be a paucity of data available on the con- 
ceptual abilities of such children, which was the 
major reason for the choice of these Ss. 
At each grade level, Ss were dichotomized into 
high and low intelligence groups on the basis of 
Lorge-Thorndike IQ Test Scores (Lorge & Thorn- 
dike, 1959) and children were randomly chosen 
from these groups. The mean IQs of the high and 
low groups were 108.5 and 83, respectively. 

The response latency scores of these children 
were compared with those obtained from a group 
of eight psychology graduate students with a mean 
chronological age of 24.3. Although the task was 
obviously very simple for this latter group, it was 
felt that their response latency patterns would 
provide an interesting point of comparison. 


Stimulus Materials 


The stimuli on which the concepts were based 
were geometric shapes which differed along four 
dimensions, each with three easily discriminable 
attributes, that is, form (circle, square, or equi- 
lateral triangle), color (red, blue, or yellow), 
height (½ inch, 1 inch, or 1½ inches), and number 
(one, two, or three). These stimuli were pretested 
on comparable groups of kindergarten children 
to assess the discriminability of the concept 
exemplars. The shapes were cut from gummed 
paper and centrally placed on 3 X 5 inch white 
index cards. These index cards were then pasted 
into two loose-leafed booklets made of black 
construction paper, three to a page. One booklet 
contained three positive instances of each con- 
cept. The other booklet contained three choices 
for each concept. The S’s task was to point to the 
choice that was like all the positive instances. 
There was, of course, only one correct choice for 
each problem, and the position of this correct 
choice was randomly varied among the left, mid- 
dle, and right index cards. 

The concepts employed varied in two ways: 
(a) the type of cue relevant to solution (color, 
form, number, or size), and (b) the number of 
irrelevant stimulus cues (one, two, or three). In 
accordance with Archer, Bourne, and Brown 
(1955), an irrelevant cue was defined as one which 
varied in the positive exemplars of a concept 
but was irrelevant to the correct solution. All 
Ss received 12 concept problems, each with one 
cue relevant to solution. The first four problems 
contained one irrelevant cue, the second four had 
two irrelevant cues, and the last four had three 
irrelevant cues. Thus, there were three 
levels of stimulus complexity. At each level of 
stimulus complexity, there was one concept based 
on form, number, size, and color. These were 
randomly varied at each complexity level. Order 
of presentation was the same for all Ss, namely, 
four concepts each with one, two, and three 
irrelevant cues. 
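
The structure of the stimulus set and of the 12-problem sequence can be laid out in a short sketch. It is illustrative only: the variable names are hypothetical, the height labels are placeholders for the three heights, and the order of concepts within each complexity level is fixed here although it was randomized in the study.

```python
from itertools import product

# Four dimensions with three attribute values each, as described in the text.
DIMENSIONS = {
    "form": ["circle", "square", "triangle"],
    "color": ["red", "blue", "yellow"],
    "height": ["short", "medium", "tall"],  # placeholder labels for the three heights
    "number": [1, 2, 3],
}

# Every possible stimulus: one value per dimension.
stimuli = [dict(zip(DIMENSIONS, values)) for values in product(*DIMENSIONS.values())]

# Twelve problems: three complexity levels, one problem per relevant cue at each level.
# Within-level order was randomized in the study; a fixed order is used here.
schedule = [
    {"relevant": cue, "n_irrelevant": n_irrelevant}
    for n_irrelevant in (1, 2, 3)
    for cue in DIMENSIONS
]

print(len(stimuli), len(schedule))
```

The four dimensions with three values each give 81 distinct stimuli, and crossing four relevant cues with three complexity levels gives the 12 problems.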

Procedure 


Each S was tested individually in the school. 
Upon entering the experimental room, S was 
informed that he would be playing a game in 
which he was to pick out pictures that looked 
like each other. Two index cards containing 
identical red circles were then presented to S, and 
he was given the following instructions: 


Here are two pictures. Do they look like each 
other? Why? [An attempt was made to elicit 
correct verbalizations of both dimensions.] Here 
are some other pictures. [Three additional index 
cards were introduced.] Can you pick out one 
picture from here that is like these two? 


If S made the correct choice, the experimenter 
(E) said “good” and asked the child “Why is 
this one like these two?” If the child was unable 
to supply the correct verbalization, E provided 
it. Four other examples of this type were then 
introduced which utilized the cues of shape, 
number, size, and color. Verbalizations were 
elicited following the child’s choices, and verbal 
reinforcement (i.e., “good,” “that’s right”) fol- 
lowed each correct response and verbalization. 
Most of the children were able to verbalize 
correctly the relevant dimension. When they could 
not, E supplied it, and asked S to repeat it. 

Following this verbalization training, the book- 
lets were introduced with the following instruc- 
tions: 


Now we're going to do some more. This time 
the pictures will be in these books. In this 
book, all the pictures on the page are alike 
in some way. Look at them very carefully to 
see in what way they are alike. Now I’m going 
to show you some other pictures in this book. 
I want you to point to the picture in this book 
[choices] that goes with these [positive in- 
stances]. Point to the one picture here that is 
like all of these. 


A few sample items of this type were admin- 
istered, using concepts with two relevant redundant 
stimulus cues and no irrelevant information. These 
served to insure that S understood the in- 
structions. Following these examples, the twelve 
problems were introduced and S was told 
that E would no longer tell him if he were right or 
wrong but would tell him how well he did at the end. 
Both choice scores and latency 
scores were recorded for each S. Latency was meas- 
ured by means of a stopwatch which was begun by 
E as soon as the positive exemplars were visible to 
S and stopped when S made a pointing response. 


Results 


Effect of Type of Relevant Cue 


The number of correct responses 
made to each of the four types of relevant 
stimulus cues was analyzed by means of a 
mixed three-way analysis of variance 
(Lindquist, 1953). The main effect of type 
of stimulus cue was not significant (F = 
TABLE 1 

Mean Number of Correct Choices of Each Group 

Group | Number of irrelevant cues (one, two, three) 
[table entries illegible] 

1.73, df = 3/198), nor did it significantly 
interact with age or IQ. The main effects 
of age and IQ were statistically significant 
at the .05 level (F = 3.47, df = 2/66 and 
F = 5.56, df = 1/66 respectively). The age 
differences indicate that more correct re- 
sponses on all items are associated with 
the older children. The mean number cor- 
rect for the first-, third-, and fifth-grade 
groups were 7.29, 7.67, and 8.41, respec- 
tively. As expected, the brighter children 
made more correct responses than their 
less intelligent peers, a mean of 8.28 as 
compared with 7.28. 
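
The degrees of freedom reported for these analyses follow from the design of 72 Ss in 3 age groups by 2 IQ levels, with one repeated factor. The sketch below reproduces that bookkeeping for a standard mixed design; the function name is hypothetical, and it is assumed that all 72 Ss contributed to every cell.

```python
def mixed_design_df(n_subjects, between_levels, within_levels):
    """Degrees of freedom for a design with between-subjects factors
    and one within-subjects (repeated) factor.

    between_levels: number of levels of each between-subjects factor
    within_levels: number of levels of the repeated factor
    """
    n_cells = 1
    for levels in between_levels:
        n_cells *= levels
    error_between = n_subjects - n_cells
    df_within = within_levels - 1
    return {
        "between_effects": [levels - 1 for levels in between_levels],
        "error_between": error_between,
        "within_effect": df_within,
        "error_within": df_within * error_between,
    }


# Age (3 levels) x IQ (2 levels) between subjects; type of relevant cue (4 levels) within.
print(mixed_design_df(72, [3, 2], 4))
```

With four types of relevant cue as the repeated factor this gives 2/66 for age, 1/66 for IQ, and 3/198 for type of cue, matching the values reported in the text.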


Effect of Stimulus Complexity 

The mean number of correct concept 
choices associated with each level of stim- 
ulus complexity for the various groups 
is presented in Table 1. 

A mixed three-way analysis of variance 
(Lindquist, 1953) conducted on these scores 
revealed the main effects of age (F = 3.42, 
df = 2/66), IQ (F = 6.33, df = 1/66), 
and level of stimulus complexity (F = 
93.01, df = 2/132) to be statistically sig- 
nificant. The effects of age and IQ on the 
number of correct choices were described 
above, i.e., the older and brighter the child, 
the more correct responses. The significant 
F value associated with level of stimulus 
complexity indicates that the addition of 
irrelevant cues increased the difficulty of 
the concept problem for these children. 
None of the interaction effects was sig- 
nificant, and it can be seen from Table 1 
that the amount of irrelevant stimulus in- 
formation showed approximately the same 
relation to performance (as measured in 
TABLE 2 

Mean Response Latencies of Each Group 

Number of Irrelevant Cues 

Group | One | Two | Three 
First grade, low IQ | 41.8 | 33.2 | 24.5 
First grade, high IQ | 37.0 | 40.4 | 29.2 
Third grade, low IQ | 27.5 | 34.2 | 32.2 
Third grade, high IQ | 21.7 | 30.1 | 29.8 
Fifth grade, low IQ | 26.6 | 30.7 | 29.8 
Fifth grade, high IQ | 21.5 | 27.8 | 39.7 
Adult reference group | 15.5 | 20.4 | 87.7 


terms of number of correct responses) at 
all age levels tested. 

A second measure obtained on all the 
children was the latency of response to each 
problem. The mean latencies of each group 
at the various levels of stimulus complexity 
are presented in Table 2. Also included in 
Table 2 are the mean latencies of an adult 
reference group of eight psychology grad- 
uate students, which was included for pur- 
poses of visual comparison. The scores of 
this latter group were not included in any 
of the statistical analyses presented (al- 
though individual F tests revealed that 
their pattern did differ significantly from 
all of the children’s groups). 

The latency scores of the first-, third-, 
and fifth-grade children were analyzed by 
means of a three-way analysis of variance 
which indicated that none of the main ef- 
fects was statistically significant. Thus, 
total response time was not associated with 
age, IQ, or stimulus complexity. Stimulus 
complexity, however, did interact signifi- 
cantly with both age (F = 3.85, df = 
4/132) and intelligence level (F = 4.5, 
df = 2/132). The specific direction of the 
Stimulus Complexity x Age interaction can 
be observed in Figure 1. 


Fig. 1. Response latencies of each group on concept problems of varying complexity (time in seconds plotted against number of irrelevant cues; separate curves for the three grades and the adult group). 


It can be noted that the response times 
of the adult reference group on this task 
are directly proportional to the amount of 
irrelevant stimulus information contained 
in the concept problems. The various age 
groups tested, however, each exhibited dif- 
ferential patterns of response latencies. 
The fifth-grade children in the present 
study exhibited a pattern of responses 
similar to the adults, that is, their mean 
latencies increased with the addition of 
irrelevant cues. The youngest group, on 
the other hand, exhibited an opposite pat- 
tern with regard to latency measures. 
Response times of the first-grade group 
decreased as the problems became more 
difficult. The performance of the third- 
grade children was somewhere between 
these two extremes. They exhibited an in- 
crease in reaction time between the simpler 
(one irrelevant cue) and intermediate (two 
irrelevant cues) complexity problems, but 
not between the intermediate and difficult 
(three irrelevant cues) problems. 
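
The grade-level trends described above can be checked against the cell means in Table 2. The sketch below averages the low- and high-IQ rows within each grade; it assumes equal numbers of Ss in the two IQ groups and uses the children's values as they appear in the table (some row labels are partly illegible in the source, so the low/high ordering is assumed). The names are hypothetical.

```python
# Mean latencies (seconds) for one, two, and three irrelevant cues, as read from Table 2.
LATENCIES = {
    ("first", "low"): [41.8, 33.2, 24.5],
    ("first", "high"): [37.0, 40.4, 29.2],
    ("third", "low"): [27.5, 34.2, 32.2],
    ("third", "high"): [21.7, 30.1, 29.8],
    ("fifth", "low"): [26.6, 30.7, 29.8],
    ("fifth", "high"): [21.5, 27.8, 39.7],
}


def grade_means(grade):
    """Average the low- and high-IQ rows of one grade (assumes equal group sizes)."""
    low = LATENCIES[(grade, "low")]
    high = LATENCIES[(grade, "high")]
    return [round((a + b) / 2, 2) for a, b in zip(low, high)]


for grade in ("first", "third", "fifth"):
    print(grade, grade_means(grade))
```

Averaged this way, the first-grade latencies fall across the three complexity levels, the fifth-grade latencies rise, and the third-grade latencies rise from one to two irrelevant cues and then level off.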

The means involved in the significant 
Stimulus Complexity x IQ interaction 
parallel the Stimulus Complexity x Age 
interaction. The more intelligent children 
at all age levels tended to approximate 
the adults’ performance more closely. The 
low IQ children, on the other hand, ex- 
hibited a different pattern of response, 
one which was more similar to that of the younger 
children tested. 


Discussion 
The results of the present investigation 
indicate that proficiency in concept forma- 
tion is related to developmental level in 
children, as assessed by both chronological 
age and IQ. This finding is in accordance 
with a number of other studies (e.g., In- 
helder & Piaget, 1958; Long & Welch, 
1941; Osler & Fivel, 1961; Sigel, 1953). 
Unlike earlier studies, however, the present 
investigation attempted to minimize several 
other possible sources of age variation 
which may not be specifically related to 
conceptual capacity, such as memory, stim- 
ulus familiarity, and the availability of 
verbal labels. Toward this end, the con- 
ceptual task employed contained readily 
discriminable perceptual dimensions, with 
explicit verbalization, and continuously 
visible positive exemplars. The finding that 
developmental differences continue to mani- 
fest themselves in a task of this type sup- 
ports the developmentalist view that gross 
qualitative differences in concept formation 
ability exist at varying maturational levels. 
One of the interests of the present study 
was to assess possible differences in infor- 
mation processing strategies in children’s 
conceptual performance. The present find- 
ings suggest that the parameter of irrele- 
vant stimulus information is very signifi- 
cantly related to concept problem difficulty 
in children. The number of correct re- 
sponses was inversely related to the level 
of stimulus complexity at each of the age 
levels tested. This finding is in accordance 
with results obtained both with college 
students (Archer et al., 1955) and with 
younger, somewhat more intelligent and 
higher social-class children in a concept- 
learning task (Osler & Kofsky, 1965). 
There was, however, a lack of interaction 
between stimulus complexity and develop- 
mental level on the number-of-correct- 
responses measure, which was not in ac- 
cordance with expectation. 

One finding that was suggestive of dif- 
ferential strategies was the significant in- 
teraction of stimulus complexity with both 
age and IQ with regard to response laten- 
cies. An exponential relationship between 
response latency and stimulus complexity 
was exhibited by the group of graduate stu- 
dents in the present study, a finding very 
similar to that obtained by Archer et al. 
(1955) with college students employing 
much more complex conceptual tasks. In 
general, the older and brighter the child, 
the more closely his reaction-time pat- 
tern approximated that of the adult group. 
Thus, the fifth-grade, high-IQ Ss showed 
the clearest increase in reaction time as 
additional irrelevant cues were introduced. 
The first-grade children, on the other hand, 
exhibited an opposite pattern. Their re- 
sponse times decreased somewhat as the 
number of irrelevant cues increased. This 
trend was particularly pronounced in the 
youngest children with the lowest IQs. 

Although other interpretations are pos- 
sible, the differential relation between re- 
sponse latency and stimulus complexity at 
the various developmental levels assessed 
in the present investigation suggests that 
the older and more intelligent children 
may have a greater capacity for processing 
more stimulus information than children at
a less advanced cognitive stage. The work of
Kagan (1965) and his associates on re- 
flective and impulsive cognitive styles ap- 
pears relevant to the present interpretation 
since, as Kagan suggests, it is possible that 
longer response times are indicative of a 
general reflective mode. It should be noted, 
however, that differences between age and 
IQ groups in overall reaction times were 
not obtained in the present investigation. 
Longer response latencies were exhibited 
by the more intelligent children only with 
regard to the more complex stimuli, thus 
suggesting that reflection was not a general 
response characteristic but rather one that 
was appropriately related to the stimulus 
characteristics of the task. 

The type of relevant stimulus cue em- 
ployed was not differentially related to 
the difficulty of the concept problem. This 
finding, although in accordance with adult 
studies (e.g., Archer et al., 1955), appears 
to run counter to a number of studies show- 
ing that certain stimulus cues may be more 
salient than others in the perceptual and 
cognitive processes of younger children 
(e.g., Brian & Goodenough, 1929; Corah, 
1966; Kagan & Lemkin, 1961). Unlike these 
earlier studies, however, the procedure em- 
ployed in the present study introduced the 
relevant stimulus dimensions together with 
appropriate labels to S prior to the concept
task. It would appear then that the
perceptual trends observed in earlier find- 
ings may be easily changed by appropriate 
verbal instructions. In this regard, it is 
interesting to note that the absence of 
cue differences was obtained in the present 
investigation with a group of lower-class 
children who would be least likely to be 
very familiar with the type of stimuli em- 
ployed. Another possible explanation which 
suggests itself for the discrepancy between 
present results and earlier findings is that 
the previously obtained dominance of cer- 
tain stimulus cues in young children may 
have been reflecting differential availability 
of labels for these dimensions, rather than 
developmental differences in perceptual 
processes. Future investigation is indicated
to assess this possibility.


263 7th-grade public school children were tested to determine
whether quiet (45-55 db.), average (55-70 db.), and noisy (75-90 db.) 
classroom and experimental conditions had a relationship to written 
task performance of relatively short duration. It was hypothesized that 
Ss would perform better under quiet than under average and noisy 
conditions and that boys would be more detrimentally affected by 
noise than girls. Noise typical of that experienced in schools and 
white noise were both used. Means and standard deviations were 
compared across the conditions used and analyses of variance were 
performed on the data. No noise effect, either detrimental or facili- 
tating, was demonstrated on speed or on accuracy of performance. 
Ss’ perceptions of the effects of noise and measured anxiety had little 
relationship to actual performance. 


Within the past several decades interest
in improving educational facilities has 
seen a steady increase. One of the areas 
around which much controversy has arisen 
is that of acoustical environment in the 
schools. In order to delineate a clear-cut 
area for investigation, only the specific 
category of the effect of noise upon writ- 
ten task performance of children was 
treated in the present study. 


PROBLEM 


The effects of noise upon human performance
have been an area of conflicting
opinion and research studies for over four
decades. Researchers such as Broadbent
([illegible], 1954, 1958a, 1958b), Grimaldi
([illegible]), Jerison (1959), Kitamura (1964),
[illegible], Creswell, and Huffman (1965),
and Weston and Adams (1935) have reported
detrimental effects of noise upon
performance. Other researchers, such as
[illegible], Braasch, and Shay (1947), Park
and Payne (1963), Sanders (1961), Teichner,
Arees, and Reilly (1963), and Tinker
(1925), have reported either equivocal results
or no evidence of a detrimental noise
effect.

This paper is based on a doctoral dissertation
submitted to the Joint Committee on Graduate
Instruction, Columbia University. Committee
members were Mary Alice White (Chairman),
[illegible]ieberfreund, and Walter H. MacGinitie.
The research reported herein was performed
[illegible] Department of Health, Education, and
Welfare, Office of Education, under the provisions
of the Cooperative Research program.
The author was a graduate student in school
psychology at Teachers College, Columbia University,
at the time the research was conducted.

Much of the existing research has used 
adult subjects (Ss), artificial conditions, 
and/or noise levels exceeding those encoun- 
tered in any but extreme situations. Plan- 
ning of educational facilities based upon 
these studies, without evidence of compara- 
bility of Ss or conditions may be some- 
what misleading. The conflicting results of 
studies pertinent to educational applica- 
tion further indicate the need for additional 
research to assist in determining the need 
for and value of acoustical treatment in 
schools. 

The major objective of this study was to 
investigate the effects of noise upon writ- 
ten performance in a realistic environmen- 
tal setting while meeting as many of the 
requirements of noise research as possible.
Such an application necessitated the
use of an actual classroom environment, 
tasks pertinent to school routine, and 
noise comparable to that encountered by 
children during school activities. While it 
might be interesting to prove that noise of 
approximately 100 decibels transmitted, 
via earphones, to children working in 
isolation chambers caused a deterioration 
in performance, it would be rather difficult 
to generalize to children who are not sub- 
jected to this level of noise, who are not 
equipped with earphones, and who do not 
work in isolation chambers. It was also
necessary to control for or to take into account
physical variables, differences in set
caused by instructions and task percep- 
tions, individual differences, the so-called 
Hawthorne effect of a positive change re- 
sulting from any alteration of conditions, 
and the carry-over effect from one condi- 
tion to another. The noise characteristics 
and instrumentation used had to be described
for purposes of analysis and possible
replication.

A secondary objective was to investigate 
individual differences under conditions of 
noise. According to Goodenough (1954, pp. 
482-483), girls achieve slightly better in 
school, possibly because of girls’ greater 
docility and better application to studies. 
Terman and Tyler (1954, pp. 1064-1114) 
cited studies indicating that boys may be 
more physically active and restless than 
girls. Considering this, it was expected that 
boys would be less motivated to work than 
girls and would be more prone to distrac- 
tion under conditions of noise. 

In line with Sarason’s (Sarason, David- 
son, Lighthall, Waite, & Ruebush, 1960; 
Sarason & Gordon, 1958) work on anxiety 
in children, it was decided to investigate 
the relationship between anxiety and the 
effects of noise. The Ss’ perceptions of noise 
and the effects of set toward noise and 
toward the experiment were also considered 
as possible secondary factors. 

Noise was defined, consistent with Peterson
and Gross (1963), as undesired sound.
Since the normal noise encountered by
children in school is intermittent or irregu- 
lar-interval noise, the study used this type 
of noise. 


Hypotheses 


Hypothesis 1. Under conditions of irreg- 
ular-interval noise varying approximately 
75-90 decibels, children’s task performance 
will be lower than under conditions of rela- 
tive quiet of approximately 45-55 decibels. 

Hypothesis 2. Under conditions of irregular-interval
noise at levels of 75-90
decibels, children’s task performance will
be lower than under conditions of normal
or average classroom noise of approximately
55-70 decibels.

Hypothesis 3. Under conditions of relative
quiet, children’s task performance will
be higher than under conditions of normal
classroom noise.

Hypothesis 4. Performance levels of boys
will be lower than performance levels of
girls under conditions of irregular-interval
noise of 75-90 decibels.


Method


Subjects 


The Ss were 129 male and 134 female seventh- 
grade children from a centralized suburban school 
on the outskirts of a small urban complex, con- 
sisting of three cities, in south-central New York 
state. No children with hearing deficiencies were 
included. 


Task 


The STEP Reading Test, Form 3 was used as 
the written task. The Ss were permitted to answer 
as many of the 70 questions as they were able 
during the 30 minute testing period in order to 
prevent a ceiling effect. Two experimental sets 
were assumed; the tension of a test situation and 
the more relaxed atmosphere of a homework situ- 
ation. These assumed sets were induced through 
differences in answer sheets, instructions, and
examiner behavior. Measures of speed, as the
total number of questions attempted, and of
accuracy, as the percentage correct of the number
attempted, were obtained. 
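
As an illustration only, the two measures defined above can be computed as in the following minimal Python sketch. The function and argument names are assumptions introduced here, as is the choice to report an accuracy of 0.0 for a pupil who attempted no questions; the report does not say how such a case was scored.

```python
from typing import Tuple


def speed_and_accuracy(attempted: int, correct: int) -> Tuple[int, float]:
    """Return (speed, accuracy) for one pupil.

    speed    = total number of questions attempted
    accuracy = percentage correct of the number attempted
    """
    if attempted < 0 or correct < 0 or correct > attempted:
        raise ValueError("need 0 <= correct <= attempted")
    speed = attempted
    # Assumption: a pupil with no attempts gets an accuracy of 0.0.
    accuracy = 100.0 * correct / attempted if attempted else 0.0
    return speed, accuracy


if __name__ == "__main__":
    # Hypothetical pupil: 50 of the 70 questions attempted, 40 correct.
    print(speed_and_accuracy(50, 40))
```

Because accuracy is taken relative to the number attempted rather than to all 70 questions, a slow but careful pupil and a fast but careless pupil are distinguished on separate scales.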


Grouping 


Eight equated groups were used and each S was
tested only once. Four days prior to the experiment,
Part 1 of an alternate form of the STEP
Reading Test was given to each S. The total number
of correct answers on this pretest was used as
the essential basis of equating the eight groups.
Scores were arranged from high to low and a
matching process ensured that the groups were
comparable.
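
The report does not describe the matching process in detail. One common way to equate groups from a ranked list, shown here purely as an assumed sketch, is to deal the ranked pupils into the groups in alternating ("serpentine") order and then to assign conditions to the finished groups at random. All names below (`equate_groups`, `assign_conditions`, the pupil labels) are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence


def equate_groups(pretest_scores: Dict[str, int], n_groups: int = 8) -> List[List[str]]:
    """Deal pupils, ranked high to low on the pretest, into n_groups.

    The direction of dealing reverses on every pass so that no group
    always receives the highest scorer of a block.
    """
    ranked = sorted(pretest_scores.items(), key=lambda item: item[1], reverse=True)
    groups: List[List[str]] = [[] for _ in range(n_groups)]
    for start in range(0, len(ranked), n_groups):
        block = ranked[start:start + n_groups]
        forward = (start // n_groups) % 2 == 0
        order = range(n_groups) if forward else range(n_groups - 1, -1, -1)
        for (pupil, _score), group_index in zip(block, order):
            groups[group_index].append(pupil)
    return groups


def assign_conditions(conditions: Sequence[str], seed: Optional[int] = None) -> Dict[int, str]:
    """Randomly assign one condition to each group number (1-based)."""
    shuffled = list(conditions)
    random.Random(seed).shuffle(shuffled)
    return {group: cond for group, cond in enumerate(shuffled, start=1)}


if __name__ == "__main__":
    scores = {f"S{i}": i for i in range(1, 17)}  # hypothetical pretest scores
    for members in equate_groups(scores):
        print(members, sum(scores[m] for m in members))
```

With the hypothetical scores in the usage example every group has the same pretest total, which is the property the matching is meant to approximate.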

The groups were also roughly equated, by
matching, on the basis of IQ, socioeconomic status,
and achievement to ensure that no systematic
differences among groups on these variables
occurred and to permit examination of individual
differences. Each group consisted of approximately
equal numbers of males and females.

After completion of the matching process, a
testing condition was assigned randomly to each
of the eight groups. The composition of the
groups is presented in Table 1.


Noise Conditions


A study was conducted of possible methods of
producing suitable noise conditions, as suggested
by past research and by surveys of typical noise
levels found in schools, and of methods of measuring
and analyzing such noise. Sound-pressure
levels in decibels were used throughout the study. 
Typical noise levels from the sample school are 
presented in Table 2. These levels are comparable 
to levels reported by Fitzroy and Reid (1963) in 
an extensive survey of 37 schools. 
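
For readers unfamiliar with the unit, the following Python sketch shows the standard definition of sound-pressure level, 20 times the common logarithm of the ratio of the measured pressure to a reference pressure. The reference value of .0002 microbar is the one stated in the note to Table 2; the function names are assumptions made for illustration.

```python
import math

# Reference pressure stated in the note to Table 2 (decibels re .0002 microbar).
REFERENCE_PRESSURE_MICROBAR = 0.0002


def sound_pressure_level(pressure_microbar: float) -> float:
    """Sound-pressure level in decibels for an rms pressure in microbars."""
    if pressure_microbar <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_microbar / REFERENCE_PRESSURE_MICROBAR)


def pressure_ratio(db_difference: float) -> float:
    """Ratio of sound pressures corresponding to a difference in decibels."""
    return 10.0 ** (db_difference / 20.0)


if __name__ == "__main__":
    # A 30-decibel difference (for example, 50 versus 80 decibels)
    # corresponds to roughly a 32-fold difference in sound pressure.
    print(round(pressure_ratio(30.0), 1))
```

The scale is logarithmic, so the gap between the quiet and noisy conditions is much larger in physical terms than the decibel figures alone suggest.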

The hypothesized levels for the quiet (45-55 
decibels), average (55-70 decibels), and noisy 
(75-90 decibels) conditions were selected so as to 
avoid exceeding the minimum and maximum 
limits which might be anticipated to occur within 
a school environment. Noise characteristics of the 
experiment were as follows: 

Part 1: Classroom. The classroom section was 
divided into quiet, average, and noisy conditions. 
Only noise familiar in some degree to Ss was 
used and testing was done in classrooms with the 
usual row type seating. 

For the quiet condition (45-55 decibels) the 
test room was maintained in an isolated situation. 
The surrounding classrooms were empty; the 
corridors were kept free from passage; and bells, 
buzzers, and intercom systems were temporarily
eliminated. 

For the average noise condition (55-70 deci- 
bels), testing was done while classes were being 
conducted in both adjacent rooms and with nor- 
mal corridor traffic including student passage, 
voices, and the noises of locker usage.

For the noisy condition (75-90 decibels) three
kinds of noise were used. The first was an external
machinery noise created by having a tractor-run
power mower pass back and forth outside of the
closed and curtained windows. The second was a
tape recording of The Blitzkrieg played in rooms
on both sides of the testing room, with the
speakers touching the walls. The third kind of
noise was human in nature and was created by
[illegible] assistants working in the corridors and
in adjacent rooms. This noise consisted of running
with metal tapped shoes, banging on and
[illegible] wall lockers, talking, whistling, and
[illegible], banging on walls and blackboards, dragging
chairs and desks across the floors, and moving
[illegible] equipment through the corridors.
TABLE 1
TESTING CONDITIONS BY GROUP

Group   N             Situation      Noise     Task
1       31            Classroom      Quiet     Homework
2       37            Classroom      Quiet     Test
3       31            Classroom      Average   Test
4       32            Classroom      Average   Homework
5       [illegible]   Classroom      Noisy     Test
6       [illegible]   Classroom      Noisy     Homework
7       33            Experimental   Quiet     Test
8       [illegible]   Experimental   Noisy     Test


TABLE 2
TYPICAL NOISE RANGES OF THE SAMPLE SCHOOL

Location                                     Decibel range (a)
Classroom
  Occupied, class silent                     54-62
  Occupied, normal speech                    60-72
  Unoccupied, school unoccupied              46-55
  Unoccupied, adjacent to band               72-86
  Unoccupied, adjacent to chorus             72-78
  Unoccupied, classes in adjacent rooms      52-58
Art room, class in session                   56-84
Corridor
  During classes                             49-58
  Between classes                            68-89
Study hall (small)                           54-65
Cafeteria during a lunch block               76-94

Note. Readings were taken over 10-minute
time intervals and repeated at least once for each
range given. A Model SS-375 Sound Spectrometer
from the Industrial Acoustics Company set on
Scale C (flat from 37.5-9600 cps) on fast speed was
used.
(a) Decibels re .0002 microbar.


Part 2: Experimental. The experimental section
was divided into quiet and noisy conditions.
The Ss were tested on the stage, which was draped
to shield against extraneous noise. Prerecorded
white noise issued from a central speaker with Ss
seated around and equidistant from it. The quiet 
condition consisted of steady white noise of ap- 
proximately 50 decibels. The noisy condition con- 
sisted of phases of quiet white noise of approxi- 
mately 50 decibels alternated with phases of loud 
white noise of approximately 80 decibels. The 
quiet and noisy periods each consisted of a total 
of 15 minutes with intervals ranging 30-180 seconds 
and with a mean interval of 75 seconds.
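
The figures above imply 12 quiet and 12 loud phases (15 minutes, or 900 seconds, divided by a mean interval of 75 seconds). The report does not say how the individual interval lengths were chosen. The Python sketch below is only one assumed way of producing irregular intervals that satisfy the stated limits; the function name and the procedure itself are hypothetical.

```python
import random
from typing import List, Optional


def make_interval_schedule(n_intervals: int = 12,
                           total_seconds: int = 900,
                           shortest: int = 30,
                           longest: int = 180,
                           seed: Optional[int] = None) -> List[int]:
    """Return n_intervals whole-second durations within [shortest, longest]
    that sum exactly to total_seconds.

    Every interval starts at the minimum length; the remaining seconds are
    then handed out one at a time to randomly chosen intervals that have
    not yet reached the maximum length.
    """
    if not n_intervals * shortest <= total_seconds <= n_intervals * longest:
        raise ValueError("total cannot be reached within the stated limits")
    rng = random.Random(seed)
    durations = [shortest] * n_intervals
    remaining = total_seconds - n_intervals * shortest
    while remaining > 0:
        index = rng.randrange(n_intervals)
        if durations[index] < longest:
            durations[index] += 1
            remaining -= 1
    return durations


if __name__ == "__main__":
    quiet_phases = make_interval_schedule(seed=1)
    print(quiet_phases, sum(quiet_phases) / len(quiet_phases))
```

This procedure tends to give durations clustered around the mean rather than spread over the whole 30-180 second range, so it should not be read as a reconstruction of the schedule actually used.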

Tape recordings were made of each entire 
testing session, and measurements of the actual 
sound-pressure levels present during the testing 
sessions were taken every 60 seconds, providing 30 
measurements for each condition. Measurements 
were made with an Industrial Acoustics Company 
Model SS-375 Sound Spectrometer set on Scale C
at fast speed and an H. H. Scott Type 450 Sound 
Survey Meter set on C weighting. The noise 
characteristics for each condition are presented in
Table 3.


Supplemental Data 


Following testing, the Sarason Test Anxiety 
Scale for Children (Sarason et al., 1960) and two 
questionnaires designed to assess Ss’ perceptions 
of noise and their awareness of the purpose of the 
experiment were administered to each S. The data 
from these instruments were used to determine 
the relationship between individual differences 
and the effects of noise. 


TABLE 3
NOISE CHARACTERISTICS OF THE ACTUAL EXPERIMENT

Group   Condition            Decibel range   M             SD
Classroom
1       Quiet, Homework      [illegible]     54.8          1.8
2       Quiet, Test          [illegible]     [illegible]   [illegible]
3       Average, Test        [illegible]     [illegible]   [illegible]
4       Average, Homework    50-72           [illegible]   4.7
5       Noisy, Test          [illegible]     82.6          4.7
6       Noisy, Homework      [illegible]     82.0          4.6
Experimental
7       Quiet, Test          [illegible]     52.6          1.1
8       Noisy, Test
          Phase 1, Quiet     [illegible]     [illegible]   1.3
          Phase 2, Noisy     79-82           79.9          .9


Testing Procedure


The experiment was carried out during the first
three periods of two consecutive days to avoid
the contamination of fatigue, which might have
increased during the latter part of the day. Quiet 
and average classroom conditions were run on the 
first day to avoid feedback of information from 
Ss tested to those to be tested. This might have 
occurred, resulting in a noise set, had the condi- 
tions more obviously connected with noise been 
run first. The Ss were instructed that they were 
taking part in a reading project and that the in- 
struments were a means of timing the tests and 
later checking on the accuracy of timing and in- 
structions. 


Treatment of Data 


The major body of data was treated in three 
steps. The means and standard deviations for 
both speed and accuracy on the reading compre- 
hension test were computed. An F maximum test 
was then used to determine the feasibility of per- 
forming analyses of variance. The observed Fmax 
numbers were not within the critical regions. 
Therefore, analyses of variance were carried out
as the best method of answering the questions 
raised by the hypotheses. 
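
The F maximum test (Hartley's test) compares the largest group variance with the smallest; a ratio below the tabled critical value is taken to mean that the variances are homogeneous enough for analysis of variance. The Python sketch below computes only the ratio. The function name and the sample data are assumptions, and the critical value, which depends on the number of groups and the degrees of freedom per group, must still be looked up in a published table.

```python
from statistics import variance
from typing import List, Sequence


def f_max(groups: Sequence[Sequence[float]]) -> float:
    """Hartley's F max: largest sample variance divided by the smallest."""
    variances: List[float] = [variance(scores) for scores in groups]
    smallest = min(variances)
    if smallest == 0:
        raise ValueError("a group with zero variance makes the ratio undefined")
    return max(variances) / smallest


if __name__ == "__main__":
    # Hypothetical scores for three groups.
    sample = [[41.0, 45.0, 39.0, 47.0],
              [38.0, 44.0, 50.0, 40.0],
              [42.0, 43.0, 46.0, 41.0]]
    print(round(f_max(sample), 2))
```

The test assumes groups of equal or nearly equal size, which is roughly the case for the eight groups described here.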

Data pertaining to the secondary questions 
were treated descriptively. The secondary ques- 
tions were those pertaining to individual differ- 
ences, such as the relationships between pupil 
perceptions and the effect of noise, or between 
anxiety and the effects of noise, and were not included
in the hypotheses.


Results


Although there was a slight tendency for 
boys to work faster and to perform less 
accurately than girls under both of the un- 
familiar conditions of white noise, this 
tendency was of too small a magnitude to 
be of any practical value. There was no 
other trend indicating any effect of noise 
upon performance. 

The results of two-way analyses of variance
performed on the data for speed and
for accuracy indicated that there were no
significant differences for condition, sex, or
interaction for either speed or accuracy.
None of the hypotheses was supported in 
any degree by the data. Not only were 
there no significant differences, but there 
were no trends indicative of any noise ef- 
fect, detrimental or otherwise. 

While there were some minor individual 
differences, these were not consistent 
enough or of enough magnitude for prac- 
tical consideration. The Ss’ perceptions of 
the effects of noise upon their performance, 
the degree of noise which was present during
the experiment, and the annoyance
value of noise had little relationship to ac- 
tual performance under the noise condi- 
tions used. Similarly, measured anxiety 
had little relationship to actual performance.


Discussion 


The major body of data, as treated by 
analysis of variance, was strong evidence 
against any effect of noise under the speci- 
fications of the experiment and upon the 
population used. According to existing lit- 
erature, if a noise effect had been demon- 
strated, it could have occurred in either 
the hypothesized detrimental direction or
in the opposite direction of assisting performance.
Neither effect was demonstrated.

Consideration of the representativeness
of the sample, the pertinence of the task to
typical school behavior, and the comparability
of the conditions used to actual
school conditions appeared to warrant generalization
outward from the experiment
to the public school population in general.
At the junior high school level, and possibly
at other grade levels, children’s test
performance on written tasks requiring
reading comprehension, of the limited duration
of a class period in length, is not
affected either positively or negatively by
the peaks of noise which are typical of a
normal school environment.

The effects of noise over time, the effects
of noise upon learning, the effects of
noise upon tasks of a different nature, and
the effects of noise at levels above or below
those found in schools were not investigated
in the present study.

The major portion of the present study 
was designed to examine the differences be- 
tween equated groups of children, rather 
than the interactions between individual 
Ss and particular noise conditions. The
reason for this choice was that practical 
school planning must consider children in 
groups. Since the data indicated that 
school children, as a whole, are not af- 
fected detrimentally by noise, it might now 
be of value to examine individual Ss tested 
under a variety of conditions. While sum- 
mary inspection of data did not indicate 
that the results were brought about by 
extremes of performance cancelling each 
other out, this does not preclude the pos- 
sibility that certain individuals might show 
test-retest changes. An experiment de- 
signed to examine this factor, if differences 
are demonstrated, might provide further in- 
sight into possible means of best handling 
particular children to facilitate the learning
process.


EFFECT OF QUESTION LOCATION, PACING, AND MODE UPON
RETENTION OF PROSE MATERIAL


LAWRENCE T. FRASE* 
University of Massachusetts 


A factorial design with 128 college Ss was used to study the effect 
of question location, question pacing, location of relevant content, 
and question mode upon the retention of question-relevant and inci- 
dental prose material. Retention was highest when questions were 
placed after paragraphs. Retention increased with the frequency 
of posttreatment questions, but it decreased with frequent pretreatment
questions. It was concluded that frequent postquestioning either
shaped or elicited appropriate reading skills while frequent preques- 
tions interfered with prose structure. Frequent questioning, either 
pre- or posttreatment, yielded precise discrimination between relevant 
and incidental material which took the form of lowered incidental 
learning without a corresponding increase in relevant learning. 
Question mode (multiple-choice or constructed response) had no 
effect. Relevant material was retained better than the incidental, but 
incidental retention was relatively high if the incidental material followed
the question-relevant material.


Several studies (Frase, 1967; Rothkopf, 
1966; Rothkopf & Bisbicos, 1967) have 
shown that questions improve retention of 
both relevant and incidental material when 
they occur after the prose paragraphs to 
which they relate. The reason postques- 
tions work better than prequestions seems 
to be that they provide cues for the elicita- 
tion or shaping of efficient reading behav- 
iors. Postquestions serve more than a re- 
view function—they produce nonspecific 
facilitation of retention over the succeeding 
material. This study attempted to answer 
some questions raised by a previous study 
(Frase, 1967). 

Frase (1967), using different materials 
and Ss, replicated the results obtained by 
Rothkopf (1966) showing that the position 
of the questions and knowledge of results 
were both important factors in learning 
from prose. In addition, Frase varied the 
length of passages between questions and 
found that even though the total number 
of questions remained the same, the ef- 
fect of question pacing tended to be dif- 
ferent for retention of relevant material as 
opposed to incidental material. In accordance
with Ausubel’s (1963) position
concerning meaningful verbal learning, the 
more the material was broken up by questions
the lower the incidental learning.
Introducing frequent questions, however,
tended to improve retention of the relevant
material. The results seemed to confirm
both a small step approach, for specific retention,
and a large step approach, for general
retention. The design of that study
precluded determining if the interaction
between relevant-incidental retention and
pacing of questions was significant. The
present study was designed to obtain data
on this interaction, and also to determine
if there is an interaction between question
pacing and the position of questions,
that is, whether the skills developed by
postquestions are more effectively shaped
or maintained as the frequency of questions
increases.

In the previous study (Frase, 1967) the
relevant material was always located in
the second part of each paragraph. If contiguity
of question and relevant material
is a critical factor for retention, then the
occurrence of questions after passages
would have been most advantageous for retention
of the question-relevant material.
On the other hand, retention of relevant
material which is located in the first part
of the paragraphs should be highest when
questions are placed before the paragraphs.
The strength and direction of question-content
proximity was also explored in the
present study.

* Now at Bell Telephone Laboratories, Murray
Hill, New Jersey.


In the present study the mode of questions
(multiple-choice or constructed response)
used with the prose passages was
also varied. Several authors have noted that
the predictability of response class (Rothkopf,
1965), difficulty of a stimulus frame
(Faust & Anderson, 1967), or question
difficulty (Hershberger & Terry, 1965) is
related to retention. On the assumption
that a constructed response question is
more difficult than a multiple-choice item,
it was predicted that retention would be
higher when constructed response items
were used along with the prose materials.
Although knowledge of results is not
given with questions, subjects (Ss) should
be able to answer questions when reading
the prose passage, thereby providing their
own knowledge of results. Hence, it was
predicted that retention of relevant material
would be higher than retention of incidental
material.

Finally, consistent with the previous
studies, it was predicted that questions
would improve retention most when they
were placed after passages.


Method


Subjects


One hundred twenty-eight introductory psychology
students participated as a course requirement.


Design 


A 2 X 4 X 2 X 2 X 2 factorial design with
repeated measures on the last factor was used.
The factors were: (a) question location: before
or after paragraphs, (b) question pacing: after
every 10, 20, 40, or 50 sentences, (c) content location:
question-relevant material located in the
first or second part of each 10-sentence paragraph,
(d) question mode: multiple-choice or constructed
response, and (e) retention items: relevant or
incidental to questions used in text.
Pacing involved one question after each 10
sentences, two questions after every 20 sentences,
four questions after every 40 sentences, or five
questions after every 50 sentences. The same
questions were thus used under all conditions but
the pacing of the questions was changed.
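
As a check on the arithmetic of the design, the between-subjects cells can be enumerated as in the Python sketch below. The factor labels are paraphrased from the description above, and the assumption of equal cell sizes is introduced here; the report does not state how many Ss fell in each cell.

```python
from itertools import product

# The four between-subjects factors; the fifth factor (retention items)
# is a repeated measure and does not add cells.
BETWEEN_FACTORS = {
    "question_location": ("before", "after"),
    "question_pacing": (10, 20, 40, 50),  # sentences between question sets
    "content_location": ("first part", "second part"),
    "question_mode": ("multiple-choice", "constructed response"),
}
N_SUBJECTS = 128

cells = list(product(*BETWEEN_FACTORS.values()))
n_cells = len(cells)                        # 2 * 4 * 2 * 2 = 32
subjects_per_cell = N_SUBJECTS // n_cells   # 4, assuming equal cell sizes
error_df = N_SUBJECTS - n_cells             # 96

if __name__ == "__main__":
    print(n_cells, subjects_per_cell, error_df)
```

The resulting 96 degrees of freedom for error agree with the denominator reported with the F ratios in the Results section (for example, df = 1/96).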


Stimulus Material


A [illegible] word passage concerning the life of
William James was selected from Psychology:
the Science of Mental Life, by G. Miller (1962).2
The passage was divided into 20 paragraphs of 10 
lines each. For each 10-line paragraph there were 
two five-alternative multiple-choice items, one re- 
lating to the first half of the paragraph and one re- 
lating to the second. These materials were the 
same as those used in the previous study (Frase, 
1967). For one-half of the Ss the questions relating 
to the first part of the paragraphs were placed be- 
fore or after the paragraphs. For the other half 
of the Ss, the questions relating to the second 
part of the paragraphs were placed either before or 
after the paragraphs. These questions, which Ss 
saw when they read the materials, were called 
relevant questions. The other half of the questions, 
over which Ss were tested later but which they 
did not see during reading, were called incidental. 

The alternative responses were dropped from 
all the multiple-choice items, yielding sets of rele- 
vant and incidental constructed response items. In 
some cases minor rewording was necessary. For 
instance, the question: 


There were ______ children in the James
family.

1. 3
2. 4
3. 5
4. 6
5. William was an only child.


became:

How many children were there in the James
family?

The materials were presented in the same man- 
ner as in the previous study (Frase, 1967). Each 
10-sentence paragraph and each question occurred 
on a separate sheet of mimeographed 4 X 11 inch 
paper (a total of 40 pages). The sequence of ques- 
tions and paragraphs which each S saw was de- 
termined by the experimental condition to which 
he had been assigned.
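
The way an experimental condition fixes the order of the 40 pages can be sketched as below. This Python sketch rests on two assumptions not stated in the report: that the questions within a set appeared in the same order as their paragraphs, and that "before" means a set of questions immediately preceded the block of paragraphs to which it related. The function name and page labels are hypothetical.

```python
from typing import List, Tuple

Page = Tuple[str, int]  # ("paragraph" or "question", paragraph number)


def build_page_sequence(location: str,
                        pacing_sentences: int,
                        n_paragraphs: int = 20,
                        sentences_per_paragraph: int = 10) -> List[Page]:
    """Return the ordered list of pages for one condition."""
    if location not in ("before", "after"):
        raise ValueError("location must be 'before' or 'after'")
    if pacing_sentences % sentences_per_paragraph:
        raise ValueError("pacing must be a whole number of paragraphs")
    block = pacing_sentences // sentences_per_paragraph
    if block < 1 or n_paragraphs % block:
        raise ValueError("pacing must divide the passage evenly")
    pages: List[Page] = []
    for start in range(1, n_paragraphs + 1, block):
        numbers = range(start, start + block)
        paragraphs = [("paragraph", n) for n in numbers]
        questions = [("question", n) for n in numbers]
        if location == "before":
            pages.extend(questions + paragraphs)
        else:
            pages.extend(paragraphs + questions)
    return pages


if __name__ == "__main__":
    # Questions after every 20 sentences: two paragraphs, then two questions.
    print(build_page_sequence("after", 20)[:4])
```

Every condition yields the same 20 paragraphs and 20 questions; only their interleaving changes, which is the point of the pacing manipulation.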

The criterion retention test (40 items) consisted 
of both the relevant and incidental multiple-choice 
items already constructed for use with the prose 
passage. The criterion test was found at the end 
of the prose materials. The entire package of ex- 
perimental materials was sealed so that it would 
not be opened until instructions to do so were 
given. 


Procedure 


The experiment was administered to all 128 Ss
in a large auditorium. As Ss reported for the ex- 
periment they were randomly assigned experi- 
mental materials and directed to alternate seats. 
The Ss were instructed not to look at the materials 
until told to do so. There were four monitors used
to maintain control over Ss’ behavior.


2 Permission for the experimental use of these
copyrighted materials was kindly granted by the
publishers, Harper & Row, Inc., 49 East 33rd
Street, New York, New York.

After all Ss had been seated, instructions were
read stating that this was an experiment to find 
out how much people can learn from reading ma- 
terial. The Ss were told to read each page of ma- 
terial carefully and to turn each page face down 
after they had read it. They were not to review or 
look back at any page after they had once read it. 
They were told to try and answer the questions 
when they encountered them, and that a final 
test would be found at the end of the reading 
material. When they completed the reading task 
they were to go on to the final test.

The Ss were asked if they had any questions, 
after which the instructions were again read. The 
Ss were then told to open their materials and to 
begin. 


Results


Question Location 


As in the previous studies, questions 
facilitated retention more when they were 
placed after the prose passage (F = 10.43, 
df = 1/96, p < .001). The means were 
11.49 and 13.02 for the before and after
treatments, respectively. 


Retention Items 


Retention of the relevant information was 
significantly higher than retention of the incidental
information (F = 28.6, df = 1/96, p < .001). The means
for the relevant and incidental information were 12.93
and 11.59, respectively. In the previous 
study a control group (which merely 
read through the prose without receiving 
questions) was run and its mean was 
equivalent to the average of all experi- 
mental groups on the incidental test items. 
The mean of 11.59 (N = 128) is the aver- 
age of all groups from the present study 
on incidental test items. This mean is the 
control mean reported in Figures 1-3 below.


Question Pacing 


Figure 1 presents the data on the interaction
between question position and pacing
of questions (interaction F = 3.2, df =
3/96, p < .05). It can be seen that the
advantage of the posttreatment groups became
larger the more frequent the questions.
Conversely, the disadvantage of placing
questions in front of the passages was
strongest when the questions occurred
most frequently. Posttreatment questions
evidently shape or elicit reading skills,
or mathemagenic behaviors (Rothkopf,
1965), more effectively with frequent questions.
On the other hand, when placed before
passages the questions lose the
capacity to arouse and maintain those
skills. The extremely low mean for the
10-sentence condition when questions preceded
the passages indicates that considerable
information was lost. The locus of this
lost information is important because, if it
were lost from incidental material, it could
be stated that prequestions focus attention
on relevant material. As a matter of fact,
the F ratio for the three-way interaction of
Question Location × Pacing × Retention
Items was fractional. Evidently, frequent
prequestions interfered in a similar
manner with retention of both relevant and
incidental information. The most reasonable
conclusion seems to be that frequent
prequestioning tends to destroy continuity
of the prose materials.

Fig. 1. Retention as a function of question position
and pacing. (Axis label: sentences between questions.)

Figure 2 reports the data concerning the
interaction of pacing with retention item
(interaction F = 3.24, df = 3/96, p < [illegible]).
Obviously, retention of the incidental material
was depressed with frequent questioning.
This depression is statistically
independent of question position and therefore
must be due to the size of the passages
between questions. The effect of questions
evidently becomes more precise, excluding
more irrelevant information, as questions
become more frequent. The general conclusion
seems to be that if questions are
frequent enough they will selectively
reinforce retention of prose content,
whether they come before or after paragraphs.
With frequent postquestions general
retention is maximized, but at the same
time differentiation of relevant from incidental
material becomes more precise.
It seems clear that two processes occur
when adjunct questions are used effectively
with prose materials. First, selective reinforcement
of the relevant material, and
second, the development of effective reading
behaviors. The first process, which
depends upon question pacing, is independent
of question location; the second process
is contingent upon postquestioning.

Fig. 2. Retention as a function of retention
item and question pacing.


Content Location


Figure 3 displays the interaction between
location of the material and retention
item (interaction F = 5.6, df = 1/96,
p < .025). If close proximity of question
and related material were a critical varia- 
ble, then there should have been a signifi- 
cant interaction between question location 
and content location, which there was not. 
On the contrary, Figure 3 says that, regard- 
less of pacing or location of questions, higher 
incidental retention was achieved if the in- 
cidental material followed the relevant 
material. This finding suggests that in the 
previous study (Frase, 1967) part of the 
observed difference between relevant and 
incidental retention was due to the design 
of the materials—the relevant material 
was always in the second part of each 
paragraph. To explain the depressed scores 
when incidental material is located early 
in the paragraphs, regardless of question 
location, it is necessary to assume that Ss 
know that the relevant material will be 
located in the later portion of the para- 
graphs. If this assumption is true, then 
it can be stated that the results show that 
Ss will skip over content (rejecting in- 
formation) to get relevant information, 
but once they have gotten relevant infor- 
mation they will not skip over the re- 
maining material. 

Fig. 3. Retention as a function of retention
item and position of content within each
10-sentence paragraph.

The assumption that Ss know the location
of relevant and incidental material,
whether questions precede or follow paragraphs,
seems reasonable since, for any
one of the 32 experimental groups, the rele- 
vant material was always located in the 
same position within the paragraphs. 
Hence, Ss could learn to expect the rele- 
vant material in a certain portion of the 
passage. The only problem with this as- 
sumption, upon which the information re- 
jection explanation rests, is that one might 
expect an interaction of Content Loca- 
tion x Retention Item X Pacing, that is, 
questions would have to be frequent in or- 
der for Ss in the postquestion group to dis- 
cover that the relevant material was lo- 
cated in a certain place within paragraphs. 
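
The group and degree-of-freedom counts reported in this section can be checked with a short sketch. It assumes a fully crossed between-subjects design with two question locations, four pacing levels (inferred only from the 3 df of the pacing interactions), two content locations, and two question modes. The level labels are placeholders chosen for illustration, not the author's terms.

```python
from itertools import product

# Counts of levels are inferred from the text; the labels are hypothetical.
factors = {
    "question_location": ["before", "after"],
    "pacing": ["level_1", "level_2", "level_3", "level_4"],
    "content_location": ["relevant_first", "relevant_last"],
    "question_mode": ["multiple_choice", "constructed_response"],
}

# Every combination of levels defines one experimental group.
groups = list(product(*factors.values()))

n_subjects = 128                              # Ss tested, as reported
ss_per_group = n_subjects // len(groups)      # equal cell sizes assumed
error_df = n_subjects - len(groups)           # between-subjects error df
pacing_df = len(factors["pacing"]) - 1        # df for pacing effects

print(len(groups), ss_per_group, error_df, pacing_df)
```

Under these assumptions the sketch gives 32 groups of 4 Ss each and 96 error degrees of freedom, which agrees with the 32 experimental groups mentioned above and with the df = 3/96 and df = 1/96 reported for the F ratios.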


Question Mode 


The final factor explored in the present 
study—question mode—showed no signifi- 
cant differences between multiple-choice 
and constructed response items. 


Discussion 


The present study directly replicated 
earlier research in two respects. Questions 
which were placed after prose paragraphs 
had both a specific and general facilitative 
effect, and questions had a more powerful 
effect upon the retention of relevant in- 
formation as opposed to incidental infor- 
mation. It was quite clear from the data 
that the development of effective reading 
behavior was contingent upon the location 
of questions and also upon the frequency 
with which the questions occurred. 

To say that these reading behaviors were 
“developed” is somewhat misleading. Ef- 
fective reading might be the result of a 
respondent process, in which case the reading
passage would elicit general problem
solving behaviors already existing in S's
repertoire. Increasing the frequency of questions
would maintain these behaviors at a
high level but there would be little improvement
in these skills as reading continued.
On the other hand, task-appropriate
behaviors might not exist in S's
repertoire, hence the postquestions would
selectively reinforce the more appropriate
behaviors. This instrumental learning
should be reflected in a gradual improvement
of the posttreatment over the pretreatment
groups over time. The present
study did not explore this relationship and
hence it leaves the question of the precise
nature of posttreatment facilitation
unanswered.

The data of the present study also revealed
that close proximity of questions to
their relevant content (within 10 sentences)
was not a critical variable for retention.
Retention of relevant information
was high regardless of the pacing of questions
or the location of the relevant information
within each paragraph. On the
other hand, surprisingly, the incidental
material was retained best when it followed
the relevant material. The explanation
offered previously for this finding was
that Ss will skip through a passage to get
to relevant information, but once having
read the relevant information they will
continue to read the remaining information.
This explanation is consistent with the
view that attention involves information
rejection (Berlyne, 1965; Schroder,
Driver, & Streufert, 1967), and contrary
to the drive reduction hypothesis which
asserts that, once having read the relevant
information, Ss’ uncertainty is reduced
and hence the remaining information
will be nonreinforcing.

Although it was predicted that the most
difficult questions (constructed response)
would lead to highest retention, there was
no significant effect of question mode upon
retention. The five-alternative multiple-choice
questions were rather difficult for Ss,
and it is probable that the form of multiple-choice
items used in the present research
did not provide an adequate test of the hypothesis.
True-false alternatives might
have been more appropriate.


RETROACTIVE FACILITATION IN MEANINGFUL
VERBAL LEARNING 


DAVID P. AUSUBEL, MARY STAGER, AND A. J. H. GAITE
Ontario Institute for Studies in Education* 


In order to ascertain whether retroactive interference occurs in mean- 
ingful verbal learning and retention, the experimental conditions 
favoring such interference were maximized by using both unfamiliar 
and conflicting original and interpolated learning materials. The 
effects of interpolated learning (Buddhism) and of overlearning of 
the original material (Zen Buddhism) were tested in a 2 x 2 factorial 
design, using 156 12th grade pupils. Both independent variables sig- 
nificantly facilitated the retention of the original material (over- 
learning: p < .01; interpolation: p < .05). The facilitating influence 
of interpolated learning was attributed to the rehearsal and clari- 
fication of the original material which it presumably induced. The 
absence of a significant interaction term indicated that prior overlearning
did not differentially affect the later facilitating effect of
interpolation.


The phenomenon of retroactive inter- 
ference in verbal learning has been clearly 
demonstrated in many studies which have 
used nonmeaningful and unconnected ma- 
terials, chiefly nonsense syllables. How- 
ever, there is much doubt as to whether 
retroactive interference occurs when con- 
nected material is meaningfully learned 
(i.e., when it interacts on a nonarbitrary, 
substantive basis with established ideas in 
cognitive structure).

In general, those studies with connected
material which demonstrated the occur- 
rence of retroactive interference have de- 
manded verbatim recall of material (e.g., 
Jenkins & Sparks, 1940; King & Cofer, 
1960; Slamecka, 1959, 1960a, 1960b, 1962). 
Further, Mehler and Miller (1964) obtained
retroactive interference for the
syntactic, but not for the semantic, aspects 
of potentially meaningful sentences, and 
Newman (1939) demonstrated retroactive 
interference for nonessential, but not for 
essential, details of a narrative. 

The majority of studies (Ausubel, Robbins,
& Blake, 1957; Hall, 1955; McGeoch
& McKinney, 1934; Mehler & Miller, 
1964; Newman, 1939) requiring substan- 
tive (as opposed to verbatim) recall of 
connected verbal material have failed to 
demonstrate clearly the operation of retro- 
active interference. Two of these studies 
(Ausubel et al., 1957; Mehler & Miller,
1964), in fact, found that material similar 
to the original material and interpolated 
between original learning and the tests for 
retention of such learning led to retroactive 
facilitation. But a recent study by Entwisle 
and Huggins (1964) indicated that, when 
engineering students were tested on a set
of principles in electrical circuit theory, the 
interpolation of a highly similar set of 
principles before testing produced signifi- 
cant retroactive interference. It is de- 
batable, however, whether the type of 
learning involved was nonarbitrary and 
substantive in nature; it is quite possible 
that the students may have learned the 
material, which was essentially mathemat- 
ical (formulae, etc.) rather than verbal, 
by rote. Hence, it is concluded that there 
has been no definitive demonstration of the
retroactive interference phenomenon in
studies requiring the meaningful (nonarbitrary
and substantive) learning and
retention of connected verbal material.

Traditionally, retroactive interference
has been explained in behavioristic terms:
Specific responses (from the originally
learned material) are lost (forgotten) because
they are superseded by competing
associative tendencies (from the interpolated
material) having greater relative
strength. A major variable in determining
the amount of forgetting is the similarity
of responses in original and interpolated
activities (Osgood, 1953), where increased
similarity, short of identity, leads to increased
interference. This relationship has
been experimentally verified only in stud- 
ies of rote learning. 

In situations involving the substantive 
retention of potentially meaningful ma- 
terial, the applicability of the behavioristic 
explanation is questionable for, as noted 
earlier, there has been no agreement con- 
cerning the effects on retention of similar 
interpolated materials. In one such situa- 
tion, Ausubel et al. (1957) found that an 
interpolated passage, which compared 
Buddhism and Christianity and which was 
substantively similar to the Buddhism 
passage that was tested, induced retroac- 
tive facilitation. This result suggested the 
interpretation that in the case of meaning- 
ful learning, where new concepts and prop- 
ositions are nonarbitrarily and substan- 
tively related (anchored) to existing ideas 
in cognitive structure, the newly learned 
material is protected, by virtue of such 
anchorage, from the interfering effects of 
subsequently encountered competing stim- 
uli and responses (Ausubel et al., 1957). 
More important than the similarity varia- 
ble for learning and retention in these 
circumstances, it was hypothesized, are 
such variables as the availability of rele- 
vant anchoring ideas in cognitive structure, 
their stability and clarity, and their dis- 
criminability from the learning material. 

These authors proposed therefore that 
the influence of interpolated learning on 
retention is not necessarily a function of
similarity of original and interpolated
materials; instead it depends on whether
or not the interpolated passage increases or
decreases the discriminability of the original
passage from its anchoring concepts in
cognitive structure and hence counteracts
or promotes irreversible reduction (forgetting),
that is, the process whereby the
originally learned material is reduced to a
least common denominator of and thus is
no longer dissociable (retrievable) from the
ideational system in which it is embedded.

The present study was designed to 
discover whether retroactive interference 
could be demonstrated in a learning situation
that involved the substantive retention
of potentially meaningful material
and that was more analogous both to the 
Entwisle and Huggins (1964) study and 
to experiments demonstrating retroactive 
interference with verbatim recall than was 
the Ausubel et al. (1957) study. To sat- 
isfy these conditions, the original and 
interpolated materials had to be both un- 
familiar and sufficiently similar to each 
other to engender confusion and conflict. 
Thus, the interpolated passage, a discussion
of Buddhism, was highly similar to
and conflicted with basic concepts in the 
originally learned passage (which dealt 
with Zen Buddhism), and both passages 
were generally unfamiliar to the experi- 
mental sample. 

The present experiment also investigated
the operation of another variable whose 
effect on retroactive interference with the 
retention of nonmeaningful material and 
with the verbatim recall of connected ma- 
terial (Slamecka, 1959, 1960a) is well es- 
tablished. It is generally accepted (Sla- 
mecka & Ceraso, 1960) that susceptibil- 
ity to retroactive interference after rote 
learning is inversely related to the level 
of verbatim acquisition of the original 
material. To investigate the effect of overlearning
on retroactive interference or facilitation
in a meaningful learning context,
this experiment was designed so that
certain Ss reread the original Zen Buddhism
passage.


Method


Subjects

The sample consisted of 156 students (91 male and 65 female).
The Ss were drawn from the total membership of all of the
Grade 13 [illegible] high schools and consisted
of those students who were present for all four
sessions of the experiment. These sessions, given
in each school on the same day, took a maximum of
[illegible] and were conducted during
a period of 1 week.


Learning Passages and Measuring Instrument

The material used to investigate retroactive
interference consisted of two passages that on the
basis of content analysis were thought to be highly
similar and conflicting. The first (original) passage
(approximately 2,200 words in length) was con- 
cerned with the history, sacred literature, doctrine, 
and ethical teachings of Zen Buddhism. The second 
(interpolated) passage (approximately 2,100 words 
in length) dealt with similar topics in the Buddhist 
faith. 

A third passage (approximately 1,500 words in
length), which dealt with the causes and types of 
drug addiction, was presented instead of the Bud- 
dhism passage to control group Ss. Because of its 
totally different content, it was presumed that this 
passage would not interfere with the Zen Bud- 
dhism passage. 

The material in all three passages was selected 
on the basis of its unfamiliarity to almost any high 
school student. Hence the interpolated material 
(the Buddhism and drug addiction passages) dif- 
fered for the experimental and control groups only 
in degree of similarity to the original (Zen Bud- 
dhism) passage and not, presumably, in familiarity. 
Empirical confirmation of the unfamiliarity of 
these passages was obtained when naive Ss who 
had not studied the material in question made 
scores on the respective tests that were not signifi- 
cantly better than chance. 

A 35-item multiple-choice test on Zen Buddhism
was used to measure the learning performance
of all Ss. Before the data were analyzed, it
was decided to eliminate four items from the test. 
These were items on which the experimental sam- 
ple did more poorly than chance. In addition, two 
of these items had negative indexes of discrimina- 
tion, that is, Ss in the bottom quartile (as deter- 
mined from total test scores) performed better on 
these two items than did Ss in the top quartile. 
The corrected split-half reliability of this shortened 
(31-item) version of the test was .73. Scores
showed a satisfactory range of variability and 
their distribution did not deviate significantly 
from the normal curve. 
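
A corrected split-half coefficient is conventionally obtained by stepping up the correlation between two half-tests with the Spearman-Brown formula. The sketch below assumes that this is the correction meant here, which the text does not state, and works backward from the reported .73 to the half-test correlation it would imply. The function names are illustrative only.

```python
def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def implied_half_correlation(r_full: float) -> float:
    """Invert the Spearman-Brown formula: half-test r that yields r_full."""
    return r_full / (2 - r_full)


reported_reliability = 0.73  # corrected split-half value given in the text
r_half = implied_half_correlation(reported_reliability)

print(round(r_half, 3))                  # uncorrected half-test correlation
print(round(spearman_brown(r_half), 2))  # recovers the reported coefficient
```

If the Spearman-Brown correction was in fact used, the uncorrected correlation between the two halves of the 31-item test would have been about .57.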

Procedure 


At the beginning of each 40-minute session, Ss 
spent approximately 5 minutes reading instructions.
The balance of the session was available for
reading the passages and, in the final session, for 
taking the test of Zen Buddhism; no Ss appeared 
to have difficulty in completing either type of task. 

In the first session of the experiment, all Ss 
studied the Zen Buddhism passage. They were 
told, with this passage and with the Buddhism and 
drug addiction passages, that they were to read 
at their customary speed, that they were not to 
turn back once they had completed reading a page, 
and that they would be examined on the material 
at a later time by means of a multiple-choice test. 
(They were not actually tested on any material 
other than the Zen Buddhism passage, but the 
anticipation of a test on each passage was thought 
necessary to sustain and equate motivation in all
groups.)

After the first session, Ss were assigned to one
of four groups, according to a 2 × 2 factorial design.
Groups A and D were to receive the overlearning
treatment, whereby they restudied the
Zen Buddhism passage during the second session.
Groups A and B were to receive the interpolated
learning treatment, studying the Buddhism material
in the third session.

TABLE 1
Mean Scores, Cell Variances, and Frequencies of Four Treatment Groups
on Test of Zen Buddhism

Group | IBP     | OZBP    | M     | s²    | N
A     | present | present | 16.08 | 19.41 | 38
B     | present | absent  | 11.36 | 14.31 | 39
C     | absent  | absent  |  9.89 | 16.46 | 37
D     | absent  | present | 14.55 | 16.61 | 42

Note. Abbreviated: IBP = Interpolated Buddhism passage, OZBP = Overlearning of Zen
Buddhism passage.

Two stipulations were made in assigning Ss to
treatment groups. First, because high school girls 
have been found to have higher verbal ability than 
high school boys (e.g., superior performance on the 
verbal portion of the School and College Ability 
Test, as shown by Ausubel & Fitzgerald, 1962), Ss
were assigned to groups in such a way that the 
ratio of girls to boys in each group was equal. (A 
chi-square test showed that, after eliminating 
those Ss who were not present for all four sessions, 
the male-female ratio in each treatment group did 
not depart significantly from equality.) Second, 
because of possible differences in ability between 
the populations of the two schools, equal propor- 
tions of students from each school were assigned 
to each treatment group. (A chi-square test showed 
that, with the 156 Ss present during the entire 
experiment, the proportion of Ss from each school 
in each group was not significantly different.) 
Aside from these two restrictions, assignment of
Ss to groups was made on a random basis. 

The second session took place two days after 
the first. Groups A and D studied the Zen Buddhism
passage a second time; Groups B and C
studied the unrelated drug addiction passage.

In the third session, 2 days later, Groups A and 
B studied the potentially interfering Buddhism 
passage, and Groups C and D studied the unre- 
lated drug addiction passage. 

During the final session, which took place 3
days after the third, and 1 week after the initial
session, all Ss were tested on the Zen Buddhism
passage. They were instructed to answer all ques- 
tions, and not to turn back once a page was com- 
pleted. 


Results and Discussion


Effect of Interpolated Learning on Reten- 
tion 


A comparison of the means (Table 1)
on the 31-item test of Zen Buddhism indicates
that the interpolated Buddhism passage,
when compared with the irrelevant
drug addiction passage, did not interfere 
with, but in fact facilitated, retention of 
the original Zen Buddhism passage. 
Analysis of variance, following Winer’s 
(1962) method for dealing with unequal 
cell frequencies when cell variances are 
homogeneous, shows that the overall facili- 
tating effect of the similar interpolated 
passage is significant, F = 5.10, df = 1/152,
p < .05. A nonsignificant interaction term 
indicates that the interpolated Buddhism 
material affected retention in the same way 
regardless of degree of original learning. 
The evidence is clear, therefore, that in 
this meaningful learning situation, retro- 
active interference did not occur when a 
connected and potentially meaningful pas- 
sage was interpolated between material to 
which it was highly similar and a test for 
substantive retention of such material. In- 
deed, the interpolated passage appears to 
have had an effect that was small but 
reliably facilitating in comparison to that 
produced by a dissimilar and nonconflicting
alternative passage. Thus it is suggested
that the learning of the Buddhism
passage may have served as a review and 
clarification of the Zen Buddhism ma- 
terial, thereby increasing both its stability 
and clarity and its discriminability from 
its anchoring concepts (presumably, re- 
lated aspects of Judaism and Christianity) 
in cognitive structure. 

The interpolation of similar and conflicting
material between the meaningful
learning of a Zen Buddhism passage and a
later test of its retention may have had a
facilitating effect on the retention of the
Zen Buddhism material because it induced
Ss to compare, on their own, the two sets
of material, and thus (a) to delineate
the similarities between them that define
their common differences from those
established ideas in cognitive structure to
which both were related in the course of
learning, and (b) to delineate the differences
between them. Both of these comparative
operations conceivably could have
enhanced retention of the original learning
material by clarifying and sharpening
its distinctive features and by increasing
its discriminability from anchoring ideas
in cognitive structure. In addition, these 
comparative operations may have fur- 
ther enhanced retention of the Zen Bud- 
dhism material because they necessarily 
required rehearsal of this material, which 
rehearsal, in turn, increased its stability, 
clarity, and discriminability. 

To the extent that both the interpolated 
material and the original learning material 
share certain common ideas that are dif- 
ferentiable on the same basis from their 
common anchoring concepts in cognitive 
structure, later exposure to the interpo- 
lated material may increase the possibil- 
ity that basic differences between the origi- 
nal material and the anchoring ideas will 
be cognized. In other words, if learning 
passages B and C are conflicting (similar
but not identical), and thus necessarily
share certain common differences relative
to their common anchoring concepts (A),
exposure to passage C makes possible the
delineation of a common set of differences
between the learning passages (B and C)
and A, and may thereby make B more discriminable
from A than if later exposure
to C had not taken place. On the other
hand, comparative efforts aimed at de- 
lineating the more specific kinds of dif- 
ferences between original and interpolated 
materials presumably help to sharpen the 
distinctive features of the original ma- 
terial, and may thus indirectly increase 
its discriminability from anchoring ideas 
in cognitive structure.

Furthermore, in the process of identifying
and clarifying simple differences between
original and interpolated learning
passages, as well as more generic differences
[...] both learning passages
from relevant anchoring ideas in cognitive
structure, S must necessarily rehearse
(i.e., [...] storage) this material.
Such rehearsal may enhance retention
in two ways. First, the [...]tion
or retrieval of partially forgotten
material (material in the process of undergoing
obliterative reduction) serves as a
partial review of this material, and may
thus enhance its stability and clarity in
cognitive structure (thereby directly increasing
its availability or retrievability
at the time of later testing). Second, the 
greater clarity and stability of the origi- 
nal material resulting from rehearsal may 
indirectly increase its later retrievability 
by enhancing its discriminability from 
those established ideational systems in 
cognitive structure to which it is anchored. 

These findings, if replicated and given 
greater generality, would have far-reaching 
implications for classroom teaching prac- 
tice. Instead of suggesting (as do the 
classical retroactive interference findings 
in the case of rote learning and retention) 
that teachers scrupulously avoid introduc- 
ing similar and conflicting material after 
a typical lesson involving meaningful 
learning, they imply that such material 
should be introduced deliberately. This 
recommendation would be based on the 
expectation that conflicting interpolated 
material would encourage the learner to 
compare related ideas in the original and 
interpolated sets of material, and thus 
facilitate retention of the original ma- 
terial through the influence of such inter- 
vening variables as rehearsal and clarifi- 
cation. 


Effect of Overlearning on Retention 


It is evident from Table 1 that a second
session of studying the Zen Buddhism pas- 
sage improved the retention of this pas- 
sage relative to the groups who studied it 
only once. Analysis of variance indicates 
that this facilitating effect is significant, 
F = 49.90, df = 1/152, p < .01. This 
finding is consistent with the results of 
previous studies (Ausubel & Youssef, 
1965; Reynolds & Glaser, 1964) which 
demonstrated that review facilitates the 
retention of meaningfully learned ma- 
terial. It presumably does so through 
mechanisms similar to, but by no means 
identical with, those postulated above to 
account for the effects of rehearsal. For one 
thing, since overlearning of the original 
material involves both more complete and 
more explicit repetition than that involved 
in rehearsal, the facilitating effect of over- 
learning is accordingly much more pro- 
nounced (see Table 1). 
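
The relative size of the two facilitating effects can be read directly from the cell means in Table 1. The following sketch computes unweighted marginal means, that is, simple averages of the cell means that ignore the slightly unequal cell sizes; this is an illustrative simplification and not a reproduction of the Winer (1962) analysis reported above. The N for Group C is read as 37, so that the four cells sum to the 156 Ss reported.

```python
# Cell means and sizes from Table 1.
# Keys are (interpolated Buddhism passage, overlearning of Zen passage).
cells = {
    (True, True): (16.08, 38),    # Group A
    (True, False): (11.36, 39),   # Group B
    (False, False): (9.89, 37),   # Group C (N read as 37; see lead-in)
    (False, True): (14.55, 42),   # Group D
}


def unweighted_marginal(factor_index: int, level: bool) -> float:
    """Average the cell means at one level of a factor, ignoring cell sizes."""
    means = [m for key, (m, _) in cells.items() if key[factor_index] == level]
    return sum(means) / len(means)


# Main effects expressed as differences between unweighted marginal means.
interpolation_effect = unweighted_marginal(0, True) - unweighted_marginal(0, False)
overlearning_effect = unweighted_marginal(1, True) - unweighted_marginal(1, False)

# Interaction contrast: gain from interpolation with overlearning
# minus gain from interpolation without overlearning.
interaction_contrast = (cells[(True, True)][0] - cells[(False, True)][0]) - (
    cells[(True, False)][0] - cells[(False, False)][0]
)

total_n = sum(n for _, n in cells.values())

print(round(interpolation_effect, 2), round(overlearning_effect, 2),
      round(interaction_contrast, 2), total_n)
```

On this rough reckoning, interpolation raises the mean by about 1.5 items and overlearning by about 4.7 items, while the interaction contrast is close to zero (about 0.06 items), in keeping with the nonsignificant interaction term.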

The absence of any interaction effect,
however, indicates that the interpolation 
of conflicting material in this experiment 
did not differentially affect retention of 
the original material for the groups which 
overlearned the latter material and the 
groups which did not. That is, the facilitat- 
ing effect of interpolation was neither 
greater nor less when it was preceded by 
overlearning of the original material than 
when it was not so preceded, and hence 
cannot be attributed in any way to the 
effect of such prior overlearning. 

Although this finding contrasts markedly 
with the comparable, previously discussed 
situation in regard to rote learning, where 
overlearning of the original material has 
been invariably found to diminish subse- 
quent susceptibility to retroactive inter- 
ference, it is nonetheless readily under- 
standable. When, as in the case of rote 
learning, interpolation has an interfering 
effect on the retention of the original ma- 
terial, any factor (e.g., overlearning) that 
increases the associative strength of such 
material quite naturally tends to lessen 
the interfering potential of competing as- 
sociative tendencies. But since the inter- 
polation of conflicting material facilitates 
rather than interferes with the retention 
of meaningfully learned original material, 
one cannot expect overlearning of the 
latter material to interact with the very 
different effects of interpolation in the 
same way as in the case of rote learning. 
However, the fact that the facilitating 
effect of interpolation is as great in a con- 
text of prior overlearning as in the ab- 
sence of such a context, permits the in- 
ference that the mechanisms underlying 
the facilitating influence of interpolation 
are different than those underlying the 
facilitating influence of overlearning; and 
hence that the occurrence of the prior
facilitating effect of overlearning does not 
preclude the later facilitating effect of 
interpolation. 


DIFFERENTIAL PREDICTION OF ACADEMIC ACHIEVEMENT
IN CONFORMING AND INDEPENDENT SETTINGS 


GEORGE DOMINO 
Fordham University 


The grade-point average (GPA) of 4 groups of college juniors, repre- 
senting high and low scorers on the CPI Ac and Ai scales, was ana- 
lyzed to test the hypothesis that conforming and independent 
achievement motivation (as measured by the CPI) is related to 
scholastic achievement reflective of conforming or independent be- 
havior. Specific hypotheses regarding differential achievement as a 
function of Ac and Ai scores were tested and, in general, supported. 
The results obtained underscore the heuristic value of separating GPA 
into subcategories reflective of diverse demands. 


Previous studies with the California 
Psychological Inventory (CPI) have 
clearly demonstrated its usefulness in pre- 
dicting academic achievement in various 
educational settings and with differing 
samples. CPI scales have shown considerable
validity in predicting performance in
mathematics (Keimowitz & Ansbacher, 
1960), in introductory psychology (Gough, 
1964b), in medical school (Gough & Hall, 
1964), in high school (Gough, 1964a; Sni- 
der, 1966), with gifted pupils (Lessinger & 
Martinson, 1961), students of average abil- 
ity (Fink, 1962; Gough & Fink, 1964), mili- 
tary enlisted personnel (Rosenberg, Mc- 
Henry, Rosenberg, & Nichols, 1962), 
National Merit finalists (Holland, 1959), 
and other samples. 

Although all of the above studies re- 
ported positive findings, the differential 
predictive potential of the CPI was not 
maximized since the underlying nature of 
the criterion (grade-point average—GPA, 
or honor-point ratio) was disregarded. As 
any student can well document, equivalent 
grades in different courses do not represent 
equivalent performances. A student’s aca- 
demic performance reflects a variety of 
factors, including personality aspects that 
can enhance or interfere with optimal 
functioning in settings where conformity or 
independence are differentially rewarded. 


*The author is indebted to Harrison Gough for 
suggestions in regard to various aspects of this 
paper, to John Walsh for statistical assistance, and 
to the Fordham University Computer Center for 
granting computer time. 


Gough (1957), in constructing the CPI, 
has explicitly recognized this by including 
two scales of achievement motivation. The 
first of these, Achievement via Conform- 
ance (Ac), identifies those aspects of mo- 
tivation that facilitate achievement in set- 
tings where conforming behavior such as 
acceptance of regulations, a high degree of 
self-discipline, efficiency, and responsibility 
are rewarded. The second scale, Achieve- 
ment via Independence (Ai), identifies 
those motivational aspects that facilitate 
achievement in settings rewarding inde- 
pendence, individuality, self-reliance, and 
creative innovation. 

The present study is an attempt to re- 
late these personality measures of con- 
forming and independent motivation to 
scholastic achievement attained in a set- 
ting rewarding conforming behavior, and 
in a setting rewarding independent be- 
havior, to test the hypothesis that the Ac 
and Ai scales show differential predictive 
patterns in different settings.


METHOD


A sample of 348 liberal arts juniors attending a
California state college on a full-time basis were
administered a test battery, including the Ac and
Ai scales of the CPI, and the D 48, a nonverbal
test of intelligence (Domino, 1964; Gough &
Domino, 1963). The distributions of scores on the
Ac and Ai scales were tallied in order to select
four groups: (a) students scoring high on both
scales (HiAc-HiAi); (b) students scoring high on
Ac but low on Ai (HiAc-LoAi); (c) students scoring
low on Ac but high on Ai (LoAc-HiAi); and
(d) students scoring low on both scales (LoAc-LoAi).
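
The four-way grouping can be sketched in a few lines of Python. This is only an illustration: the article does not state the cutoffs used to define "high" and "low" scorers, so the ranges below are an assumption borrowed from the group ranges reported in Table 1, and `classify` is a hypothetical helper name.

```python
from typing import Optional

# Assumed cutoffs: the score ranges reported for the final groups in Table 1.
# The selection rule actually applied to the 348 juniors is not given in the text.
AC_HIGH, AC_LOW = range(30, 36), range(16, 23)
AI_HIGH, AI_LOW = range(23, 29), range(11, 17)


def classify(ac: int, ai: int) -> Optional[str]:
    """Return a group label such as 'HiAc-LoAi', or None for mid-range scorers."""
    if ac in AC_HIGH:
        ac_part = "HiAc"
    elif ac in AC_LOW:
        ac_part = "LoAc"
    else:
        return None
    if ai in AI_HIGH:
        ai_part = "HiAi"
    elif ai in AI_LOW:
        ai_part = "LoAi"
    else:
        return None
    return f"{ac_part}-{ai_part}"


print(classify(32, 25))  # a high scorer on both scales
print(classify(25, 25))  # mid-range Ac score, so not assigned to any group
```

Students whose scores fall between the high and low ranges are left unclassified, which mirrors the extreme-groups design described above.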

Registrar's records were then consulted to determine
courses taken and grades received by
these Ss during their first 2 years of college. Only
Ss enrolled in full programs (15 units or more per
semester) for four consecutive semesters were considered.
For every course taken by any student the
instructor was interviewed in an attempt to determine
whether the particular course rewarded
conforming or independent behavior on the part
of the students.

A course was deemed as rewarding conforming
behavior if it was characterized by emphasis on:
(a) memorizing of technical terms, definitions,
poems, etc.; (b) presentation of material through
lectures; (c) objective type examinations; (d)
keeping of attendance records; (e) discipline and
adherence to regulations (e.g., no smoking, absences
justified by written medical reasons); (f)
clearly defined and frequent homework assignments
emphasizing convergent thinking; (g) rare
use of visual aids, outside speakers, little variation
in class routine; (h) close correspondence between
lecture material and textbook; (i) identical
assigned readings for all class members; and (j)
course grade determined by proportional weighting
of various course requirements.

A course was deemed as rewarding independent
behavior if it was characterized by emphasis on:
(a) ideas rather than facts; (b) seminar discussions,
student presentations, or question-answer
format; (c) no examinations, or examinations involving
essay questions; (d) little concern for attendance;
(e) little explicit emphasis on discipline
and adherence to school regulations; (f) no homework
assignments, or assignments demanding divergent
thinking; (g) variety of presentation, as
indicated by use of visual aids, tape recordings,
outside speakers, or other material; (h) little direct
overlap between class discussions and textbook
content; (i) suggested readings, or assigned readings
individually tailored to a student's interests;
and (j) grade determined by consultation with
student or by global evaluation of student’s performance.

Using these criteria, it was possible to label 73 
courses as conforming and 32 as independent. 

Every student’s grades were divided into those
received in conforming courses and those received
in independent courses.* Since it was not possible
to contact all instructors concerned, and since
some students had taken only one type of course
(typically, conforming courses), a number of students
had to be omitted from the analysis.
Four groups of 22 Ss each were finally retained;
groups were matched for sex and intelligence
(D 48 scores). Table 1 indicates the Ac and Ai
scores descriptive of each group, as well as the sex
ratios and D 48 scores.


—— 


* All grades were converted by an honor-point
ratio formula where A = 4, B = 3, C = 2, D = 1,
and F = 0; grades were multiplied by credits per
course and divided by total credits carried.
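
The honor-point ratio described in this footnote is a credit-weighted mean of grade points. A minimal sketch follows; it assumes the conventional 4-point mapping (A = 4 through F = 0), and the function name and the sample transcript are hypothetical.

```python
# Assumed conventional grade-point mapping (A = 4 ... F = 0).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def honor_point_ratio(courses):
    """Credit-weighted grade-point average.

    `courses` is a list of (letter_grade, credits) pairs.
    """
    total_credits = sum(credits for _, credits in courses)
    if total_credits == 0:
        raise ValueError("no credits carried")
    weighted_points = sum(GRADE_POINTS[grade] * credits for grade, credits in courses)
    return weighted_points / total_credits


# Hypothetical transcript: a 3-credit A and a 3-credit C.
print(honor_point_ratio([("A", 3), ("C", 3)]))
```

Applying the same function separately to a student's conforming courses, independent courses, and all courses would give the three criterion scores used in the study (GPAc, GPAi, and GPAt).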


TABLE 1

SUMMARY STATISTICS FOR FOUR ACHIEVEMENT GROUPS

                    Achievement groups
Variables         | HiAc-HiAi | HiAc-LoAi | LoAc-HiAi | LoAc-LoAi
CPI Ac   range    | 30-35     | 30-35     | 16-22     | 16-22
         X̄        | 31.9      | 31.5      | 19.5      | 19.1
         SD       | 1.89      | 1.50      | 2.28      | 2.10
CPI Ai   range    | 23-28     | 11-16     | 23-25     | 11-16
         X̄        | 24.5      | 14.6      | 23.9      | 14.2
         SD       | 1.67      | 1.42      | 0.78      | 1.37
Sex composition   | 19 M      | 17 M      | 18 M      | 19 M
                  | 3 F       | 6 F       | 4 F       | 3 F
D 48     X̄        | 26.8      | 27.2      | 27.0      | 25.9
         SD       | 5.7       | 6.0       | 5.8       | 6.2

Note. N = 22 per group.


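The following specific hypotheses were made:
1. Concerning total GPA (GPAt):
a. the HiAc-HiAi group should have a higher
mean GPAt than any of the other groups;
b. the LoAc-LoAi group should have a lower
mean GPAt than any of the other groups.
2. Concerning conforming GPA (GPAc):
a. the HiAc-HiAi group should have a higher
mean GPAc than the LoAc-HiAi group;
b. the HiAc-LoAi group should have a higher
mean GPAc than the LoAc-LoAi group.
3. Concerning independent GPA (GPAi):
a. the HiAc-HiAi group should have a higher
mean GPAi than the HiAc-LoAi group;
b. the LoAc-HiAi group should have a higher
mean GPAi than the LoAc-LoAi group.
These hypotheses were tested by computing F
ratios across the four means for each of
the three GPAs, as shown in Table 2. Follow-up
comparisons (t tests) were then made
to evaluate the indicated comparisons.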


RESULTS 


Table 2 presents the Xs, SDs, and F 
ratios for the intergroup comparisons of 
GPAt, GPAc, and GPAi.

All four F ratios achieved statistical
significance at the .01 level; t tests for
individual comparisons are therefore permissible.
Of the nine t test comparisons in
[...] involving the six specified hypotheses,
[...] reached statistical significance
[...] the hypothesized
direction although not significant.
For overall GPA, the HiAc-HiAi subsample
was significantly higher than any
TABLE 2

GROUP COMPARISONS ON THREE TYPES OF GRADE-POINT AVERAGE (GPA)

             | Conforming GPA | Independent GPA | Total GPA
Group        |   X̄      SD    |   X̄      SD     |   X̄
HiAc-HiAi    |  2.97    .55   |  3.33    .37    |  3.15
HiAc-LoAi    |  2.66    .93   |  2.35    .77    |  2.50
LoAc-HiAi    |  2.49    .45   |  2.70    .59    |  2.60
LoAc-LoAi    |  2.34    .48   |  2.14    .51    |  2.24
F test       |  6.77*

*p < .01


other. For conforming GPA, the HiAc- 
HiAi subsample was significantly higher 
than the LoAc-HiAi. For independent 
GPA, the HiAc-HiAi sample was higher 
than the HiAc-LoAi, and the LoAc-HiAi 
was higher than the LoAc-LoAi. The LoAc- 
LoAi sample was significantly lower on 
total GPA than any other group. 
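
A pairwise comparison of this kind can be recomputed from the published means and standard deviations. The sketch below uses a standard pooled-variance t test for two independent groups; whether the author used exactly this formula is an assumption, so the result is close to the tabled value for the same comparison but need not match it to the second decimal. The function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic computed from summary statistics (pooled variance)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Conforming GPA, HiAc-HiAi vs. LoAc-HiAi, N = 22 per group (values from the tables).
t = pooled_t(2.97, 0.55, 22, 2.49, 0.45, 22)
print(round(t, 2))
```

With 22 Ss per group the comparison has 42 degrees of freedom, so a t of about 3 lies beyond the conventional .01 critical value.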


Discussion 


In view of repeated findings (cf. Gough,
1966) that the roles of Ac and Ai as forecasters
of scholastic achievement vary according
to intellectual ability and environmental
demands, it is important to
note that Ss in this study represent average 
college juniors, as indicated by a mean 
GPAt of 2.62 across all four groups, as well 
as the D 48 distribution which approxi- 
mates the average for college students 
(Domino, 1964). 

It should also be acknowledged that the 
restrictions imposed in forming the four 
groups negate any possibility of randomness
of sampling. In addition, use of
junior-year Ss automatically restricted the 
range of possible grades obtained in the 
first two college years, since failing and/or 
marginal students would have been elimi- 
nated. 

Liberal arts juniors were specifically 
selected since their scholastic index was 
based on four semesters of work which 
would include general courses, exposure to 
both sciences and humanities, and a predominance
of nonmajor subjects.

It is clear that some disciplines are more 
amenable to one type of presentation than 
another. In fact, one may question whether 
the obtained results reflect subject matter 


TABLE 3

INTERGROUP COMPARISONS

                   |     All courses     |     Humanities      |      Sciences
Grade-Point Average|  X̄     SD     t     |  X̄     SD     t     |  X̄     SD     t
Total
  HiAc-HiAi vs.    | 3.15   .38          | 3.40   .52          | 2.90   .20
    HiAc-LoAi      | 2.50   .78   3.34** | 2.46   .69   5.10** | 2.54   .73   2.23*
    LoAc-HiAi      | 2.60   .48   4.06** | 2.73   .36   4.96** | 2.47   .38   4.60**
    LoAc-LoAi      | 2.24   .44   7.03** | 2.20   .41   8.51** | 2.28   .45   5.90**
  LoAc-LoAi vs.    | 2.24   .44          | 2.20   .41          | 2.28   .45
    HiAc-LoAi      | 2.50   .78   1.92   | 2.46   .69   1.48   | 2.54   .73   1.42
    LoAc-HiAi      | 2.60   .48   2.45** | 2.73   .36   4.57** | 2.47   .38   1.52
Conforming
  HiAc-HiAi vs.    | 2.97   .55          | 2.88   .54          | 3.06   .68
    LoAc-HiAi      | 2.49   .45   3.04** | 2.51   .38   2.62** | 2.47   .61   3.02**
  HiAc-LoAi vs.    | 2.66   .93          | 2.60   .85          | 2.72   .76
    LoAc-LoAi      | 2.34   .48   1.37   | 2.66   .30   0.31   | 2.02   .52   3.57*
Independent
  HiAc-HiAi vs.    | 3.33   .37          | 3.68   .46          | 2.98   .32
    HiAc-LoAi      | 2.35   .77   5.15** | 2.30   .61   8.46** | 2.40   .58   4.11**
  LoAc-HiAi vs.    | 2.70   .59          | 2.78   .46          | 2.62   .59
    LoAc-LoAi      | 2.14   .51   3.26** | 2.26   .38   4.09** | 2.02   .49   3.68**

*p < .05.
**p < .01.




differences rather than the interaction of
style of achievement with instructor’s style
of teaching. The answer to this question
was obtained in two ways: (a) Catalog
descriptions of the 105 courses were independently
rated by three judges, not
acquainted with this study, as either humanities
or science courses. Of the 83
courses labeled as humanities by at least
two of the three judges, 26 had been designated
as independent and 57 as conforming.
Of the 22 science courses, 6 had been
designated as independent and 16 as conforming.
A chi-square analysis, incorporating
Yates’ correction, gave a nonsignificant
value of .003. (b) The same intergroup
comparisons made on GPAt, GPAc, and
GPAi (see Table 3) were computed for
humanities courses only and for science
courses only. The results of these comparisons,
also presented in Table 3, are essentially
identical, with the exception of
the LoAc-LoAi vs. LoAc-HiAi comparison,
which for science courses does not reach
statistical significance.
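
The 2 × 2 check on course type by subject area can be sketched with the textbook formula for a Yates-corrected chi-square. This is an illustration under that assumed formula, with a hypothetical function name; recomputing from the published cell counts gives a statistic that is likewise very small and far below the .05 critical value of 3.84 for 1 degree of freedom, though it need not reproduce the reported figure exactly.

```python
def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Rows: humanities, science; columns: independent, conforming (counts from the text).
chi2 = yates_chi_square(26, 57, 6, 16)
print(chi2 < 3.841)  # True means the association is not significant at .05
```

A nonsignificant result here supports the reading that independent and conforming courses were spread similarly across humanities and sciences.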

The results obtained underscore the 
heuristic value of separating GPA into 
subcategories reflective of diverse de- 
mands. In this connection, it should be 
noted that Elton (1966) analyzed Omnibus 
Personality Inventory scores for 92 fresh- 
men students having identical depart- 
mental schedules, and reported a slight 
tendency for personality traits predictive 
of grades in English courses to be nega- 
tively related to grades in chemistry. 

Given current curricular and attitudinal 
changes occurring on most college cam- 
puses, it is important to keep in mind that 
not every student can achieve his best in 
a conformist (or independence-demanding)
setting. In the past, the typical curriculum 
may have demanded and rewarded con- 
forming behavior. Today, wider use of 
honors programs, undergraduate seminars, 
interdepartmental majors, 3-year bacca- 
laureate programs, and other curricular re- 
forms, seem to emphasize independent be- 
havior. Rather than fit the student to the 
curriculum as is presently done, it might 
be extremely worthwhile to fit the curriculum
to the student by providing each
student with the type of setting which most
effectively utilizes his potential. 

In commenting on the results obtained in 
this study, it must be noted that they de- 
rive from a particular college setting and 
may not be generalizable to other educa- 
tional institutions. Both Holland (1959) 
and Thistlethwaite (1959) have presented 
evidence of variation in institutional en- 
vironments, leading to somewhat different 
patterning of variables predictive of aca- 
demic achievement within these settings. 
The HiAc-HiAi student, very likely, will 
do well in any academic environment. For 
the HiAc-LoAi and LoAc-HiAi students, 
however, there is a distinct and under- 
standable interaction between achievement 
and the demands of the environment. 

IN OLDER PERSONS

Research evidence, accumulated over
the past three decades, appears to support
the hypothesis that elderly Ss, in comparison
with younger ones, exhibit performance
deficits on a wide variety of
learning tasks. However, the magnitude
of the decrement is believed to be dependent
upon the nature of the material to
be learned. Studies by Gilbert (1941), and
Korchin and Basowitz (1957) have indicated
that the elderly seem to be less
capable than the young of dealing with
novel and difficult material which cannot
be readily integrated with earlier experience.
“Old” learning and retention is believed
to be less subject to deterioration.
Testing memory function on a number of
different variables, Gilbert (1941) discovered
that the greatest decrement in
the performance of older persons appeared
in the learning of a Turkish-English
vocabulary and in the acquisition and retention
of paired associates. The difficulty
encountered in these tasks seemed
to lie in the lack of logical connections in
the words presented, making it necessary to
establish [...]

Although there is substantial evidence
that learning ability tends to decrease
with advancing age, the results of research
on retention have been more equivocal
(Wimer, 1960; Wimer & Wigdor,
1958). Not only have studies on relearning
and recall yielded conflicting [...]
but investigators have been criticized for
not controlling for original learning, an
omission which made it difficult to determine
whether older Ss were poor learners,
had poorer memories, or both.

[...] the relatively less adequate performance
[...] learn. Donahue (1956) [...]
attitudinal and emotional [...] appear
in the later years of [...]
deleterious effect on learning. Welford
[...] shown increasing [...]
30 were reluctant to [...]
persuaded to participate in a learning [...]
they frequently revealed [...]
the quality of their [...]
Donahue (1956) has suggested [...]
manner in [...]
individual may be an important [...]
effect of variation of [...]
learned, future [...]
variation in the kinds of instructions given
[...] experimental work has
been directed toward the combined effects
of experimentally induced verbal stress and 
measured anxiety in college students 
(Sarason, 1956; Sarason, 1957; Sarason, 
1958; Sarason & Palola, 1960). Performance
levels have been observed to change
if the individual was told that a test was a 
measure of intelligence or that his score 
would be compared with that of others. 

The purpose of the present study was to 
determine whether age-related differences 
in paired-associate learning and retention 
could be modified by the experimental 
manipulation of two motivational varia- 
bles. It was hypothesized that (a) the 
uncommon paired-associate list, represent- 
ing the more novel and difficult material, 
is acquired and relearned in a greater num- 
ber of trials by elderly persons under 
challenging instructions and (b) the
difference in performance between young
and old on the identical task would be
smaller under supportive than under challenging
or neutral experimental treatments.


METHOD


Subjects 


One hundred twenty noninstitutionalized American-born
white males, half of whom were between
the ages of [...]
the 65 to 75 [...]
Gallup-Thorndike [...] mean raw
score of correct [...]
Gallup Thorndike Voting Sample of [...] included
in the study. Socioeconomic status was
evaluated on the basis of Reiss’ 196[...] socioeconomic
index.


Learning Materials 


The learning tasks were two lists of paired- 
associate nouns of 10 pairs each. The stimulus 
words, selected from the Kent-Rosanoff 100 Word 
List (Russell & Jenkins, 1954), were identical for 
both lists. The response items differed in level of
difficulty (Ross, 1967). One list contained pairs
which, according to a pilot study conducted by the
experimenter (E), were found to be relatively easy.
The pairs were selected on the basis of an associative
strength between 11% and 55% as determined
by Tresselt’s* norms. These norms were based on
the responses and the frequency of responses given 
to the stimulus words by groups comparable to 
those used in the present study. The associative 
strengths of the words chosen were equivalent for 
both age groups. 

The response words representing the more difficult
words were selected on the basis of 0% to
8% associative strength. For example, a response 
term of a paired associate of zero associative 
strength never appeared as a response to a stimu- 
lus word on Tresselt’s norms, while an associative 
strength of 7% or 8% represented only one response
to a stimulus word given by her younger
and older Ss, respectively. No words were included 
in the two lists which were beyond the eighth 
grade difficulty level on the Thorndike-Lorge 
(1951) lists. The easy paired associates have been 
designated “common” and the more difficult paired 
associates, “uncommon,” throughout the experiment.
The stimulus and response terms were
printed in black letters ½-inch high × ¥%-inch
wide on 3 × 5 inch unlined white index cards. On
one side of the card appeared a stimulus word, and 
on the reverse side the response word. An electric 
timer, placed out of S’s view, was set to flash 
on and off for 1 second at 5 second intervals. 


Procedure 


Three to 5 weeks before Ss participated in the 
experiment, each S was administered the GT Vocabulary
Test. No S was informed of the purpose
of the test but was merely told that the psychologist
was interested in finding out which words were
considered by most people to be easy and which
were considered hard.

Those Ss who had fulfilled the vocabulary cri- 
terion were sent a letter asking them to participate 
in a research project to be held at the day or recreation
centers. The 120 volunteers were assigned
randomly to one of six experimental groups as
follows: Twenty of the old and 20 of the young
Ss comprising Groups 1 and 4 were given neutral
instructions; 20 of the old and 20 of the young Ss
comprising Groups 2 and 5 were given supportive
instructions; 20 of the old and 20 of the young Ss
comprising Groups 3 and 6 were given challenging
instructions.
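
The assignment scheme (random allocation within each age group to three treatments of 20 Ss each) can be sketched as follows. The subject identifiers, the seed, and the function name are hypothetical; the article does not describe the randomization mechanics.

```python
import random

TREATMENTS = ("neutral", "supportive", "challenging")


def assign_groups(subjects_by_age, seed=0):
    """Shuffle each age group and split it evenly across the treatments."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    groups = {}
    for age_group, subjects in subjects_by_age.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        size = len(shuffled) // len(TREATMENTS)
        for k, treatment in enumerate(TREATMENTS):
            groups[(age_group, treatment)] = shuffled[k * size:(k + 1) * size]
    return groups


# Hypothetical identifiers for the 60 old and 60 young volunteers.
volunteers = {
    "old": [f"old-{i}" for i in range(60)],
    "young": [f"young-{i}" for i in range(60)],
}
groups = assign_groups(volunteers)
print({key: len(members) for key, members in groups.items()})
```

Randomizing separately within each age group keeps the design balanced, with 20 Ss in every age-by-treatment cell.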

Each S was tested individually by E. After
S was seated, he was given either the neutral,
supportive, or challenging instructions as follows:


Neutral Instructions 


I am going to show to you and read to you
a list of words, two at a time. When I finish reading
the two words that go together, I am going to
say one word of each pair and ask you to tell me
the word that went with it. For example, if the
words are EAST-west, GOLD-silver, then when I say
the word EAST, I would expect you to say (pause)
west. And when I say the word GOLD, you would,
of course, answer (pause) silver. Do you understand?


*Margaret Tresselt, personal communication,
October 2, 1963.


These instructions were repeated for the groups 
receiving the supportive and neutral instructions 
after the reference to Columbia University. 


Supportive Instructions 


I need your help with a research project that 
I am doing for Columbia University.... I am 
interested in finding out something about the 
characteristics of words. Your performance is not 
my main concern. My purpose in asking you to 
do this task is just to find out which words go to- 
gether more easily and which do not. 


Challenging Instructions 


I am doing a research project for Columbia 
University... The ability to learn this material 
is a good test of your intelligence, not of what 
you know but of how well you can learn new 
things. It’s to your advantage, then, to do your 
best to show how capable you are, how bright 
you are in relation to people of your own age. 
Listen carefully and do your best, for your score 
will be compared with those of other subjects. 


Acquisition of Paired Associates 


The paired-associate lists were presented to Ss
by the anticipation method. The order of presentation
of the lists was alternated so that half of the
Ss received the common pairs first, and the balance
of the Ss, the uncommon pairs first. The E
held a card in front of the seated S and called out
the stimulus word. After a 5-second interval, the
card was turned to expose the response term for
5 seconds. When the entire list had been presented
in this manner, another 5-second interval was allowed
while the cards were shuffled. On the next
trial the stimulus word was again shown to S while
E called the stimulus card as it appeared, but on
this, and subsequent trials, S was required to give
the response term. If the correct association was
supplied, E repeated the word, reversed the card
and proceeded to the next pair after a 5-second
interval. If S made an error or failed to give any
response, E supplied the correct association. The
Ss who did not reply within the allotted time were
given an additional 5 seconds during which time
the cards were reshuffled to randomize the order
of presentation of the word pairs. This procedure
was repeated until a criterion of two correct recitations
of the list was met. The relatively slow rate
of presentation was selected to allow sufficient time
for the elderly Ss to make a response. Persons who
failed to learn the list within 30 trials were excluded
from the study.


During the ½-hour interval between the acquisition
and relearning of the paired-associate
list, S and E were engaged in working on simple
jigsaw puzzles.


Relearning of Paired Associates 


The E introduced the relearning part of the ex- 
periment as follows: “I am going to read to you 
and show to you the same words that you had 
before. As you recall, I shall give you a word and 
you will give me the word that goes with it.” The 
same instructions as given before the learning trials 
were repeated except that the reference to Columbia
University was omitted. The procedure continued
as before. Upon termination of the task, S
was dismissed and told to return in ½ hour to
continue the experiment.

The Ss spent the interval before the learning 
and relearning of the second list pursuing their 
usual activities in the day center. At the stipulated 
time Ss were tested for their learning of the second 
list. The E began the session by saying, “Here are 
some more words that I am going to show to you 
and read to you....” The exact procedure as out- 
lined for the presentation of the first list was fol- 
lowed for the acquisition and relearning of the 
second list. An anxiety self-rating scale, devised
by E, was administered at the end of the experi- 
mental session to enable Ss to rate themselves on 
a 5-point continuum according to the degree of 
tension and anxiety experienced during testing. 
Upon completion of the scale, an autobiographical 
questionnaire was given to Ss. Several weeks 
after all Ss were tested, five psychologists evaluated 
12 recorded experimental sessions of the elderly Ss,
equally divided among the three treatments. 


RESULTS


Before analyzing the results, the distri- 
bution of relevant control variables was 
examined. Separate 2 X 3 analyses of 
variance for age and type of instruction 
were computed for socioeconomic status 
scores, GT vocabulary scores, anxiety 
self-rating scores, and number of years of 
schooling. The F ratios were not signifi- 
cant for socioeconomic status, GT vocabu- 
lary, and anxiety self-rating. However, 
the F ratio for education was significant
(F = 54.10, p < .001). There were
no significant interaction effects for any
of the variables.

Additional analyses were made to determine
the extent to which differences in
education might be related to learning
trials. When education was correlated with
acquisition scores for old and young separately,
the correlation was not significant
TABLE 1

MEANS, STANDARD DEVIATIONS, AND DIFFERENCE SCORES OF NUMBER OF TRIALS FOR LEARNING
COMMON AND UNCOMMON PAIRED-ASSOCIATES TO CRITERION BY YOUNG AND OLD
GROUPS UNDER THREE EXPERIMENTAL TREATMENTS

             |              Common                 |              Uncommon
             |   Young    |    Old     |           |   Young    |     Old     |
Treatment    |  X̄    SD   |  X̄    SD   | Difference|  X̄    SD   |   X̄    SD   | Difference
Neutral      | 2.20  .40  | 3.30  1.14 |   1.10    | 5.80  2.44 | 16.40  5.32 |   10.60
Supportive   | 2.35  .57  | 2.95  1.00 |    .65    | 5.20  2.02 | 13.20  4.99 |    8.00
Challenging  | 2.40  .26  | 3.35  1.10 |    .95    | 5.50  2.06 | 22.00  6.93 |   16.50

Note. N = 20 in each subgroup.


for either group (.04 and .20, respectively). 
Thus, within each age group, education 
bears no relationship to efficacy of performance.
However, a correlation of -.44
between education and learning trials for 
young and old Ss combined was signifi- 
cant. Age also correlated significantly with 
acquisition trials (r = .75). A partial cor- 
relation between age and learning trials, 
eliminating the effects of education, was 
run and the results showed that age and 
performance were correlated .68. This high 
correlation probably owes much of its 
magnitude to the wide age spread between 
the young and old groups. When the in- 
fluence of age was partialed out, education 
correlated only .03 with acquisition trials, 
indicating that age was the factor most 
responsible for the differential perform- 
ance between the groups. 
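
The partialing step can be illustrated with the standard first-order partial correlation formula. The article reports the age-trials (.75) and education-trials (-.44) correlations but not the age-education correlation, so the value of -.56 used below is purely an assumption, chosen because it roughly reproduces the reported partials; the function name is hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


R_AGE_TRIALS = 0.75   # reported in the text
R_EDU_TRIALS = -0.44  # reported in the text
R_AGE_EDU = -0.56     # NOT reported; assumed for illustration only

# Age with trials, education partialed out.
age_given_edu = partial_r(R_AGE_TRIALS, R_AGE_EDU, R_EDU_TRIALS)
# Education with trials, age partialed out.
edu_given_age = partial_r(R_EDU_TRIALS, R_AGE_EDU, R_AGE_TRIALS)

print(round(age_given_edu, 2), round(edu_given_age, 2))
```

Under that assumed value, age keeps most of its relationship with learning trials once education is held constant, while education keeps almost none once age is held constant, which is the pattern the paragraph above describes.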

Under the challenging treatment older 
Ss required approximately one-third more 
trials to fulfill the learning criterion for 


TABLE 2

DIFFERENCE SCORES BETWEEN YOUNG AND OLD
SUBJECTS UNDER NEUTRAL, SUPPORTIVE, AND
CHALLENGING TREATMENTS FOR ACQUISITION
OF UNCOMMON PAIRED-ASSOCIATES

Treatment    | Mean difference |  SD  | diff |   t
Neutral      |      10.6       | 5.1  |      |
             |                 |      | 2.6  | 1.62
Supportive   |       8.0       | 5.1  |      |
             |                 |      | 8.5  | 4.47*
Challenging  |      16.5       | 7.9  |      |

Note. N = 20 in each subgroup.
*p < .001.


the uncommon words than under either of 
the other conditions (Table 1). A simple 
analysis of variance showed that for old 
Ss the difference between the experimental 
groups was highly significant (F = 11.17, 
df = 2/57, p < .001). In addition to the 
significant F ratio, t ratios were calculated
to determine more specifically the locus of 
differences in the experimental groups. A 
comparison of the mean scores of the older 
Ss between the neutral versus challenging 
and the supportive versus challenging instructions
yielded significant t values of
2.8, p < .01 and 4.6, p < .001, respectively. 
The performance of older people appeared 
to be at its best in a supportive situation, 
but at its worst in a challenging one. 

The older Ss not only required more 
trials to reach the learning criterion but 
displayed greater variability in their performance
than did the younger Ss. Difference
scores were obtained by subtracting
the mean number of trials required
by the younger Ss from the mean number 
of trials required by the older Ss to reach 
the learning criterion. Table 2 summarizes 
the t ratios of the differences between the 
difference scores of all Ss under the three 
experimental conditions. The difference be- 
tween young and old in the number of 
trials needed to reach the criterion under 
the supportive instructions was signifi- 
cantly smaller than under the challenging 
treatment, t = 4.17, p < .001. However,
when the difference between the performance
of both groups under the challenging
and neutral conditions was analyzed, the 
value of t did not approach significance
at the 5% level. 
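
The difference scores for the uncommon list can be recomputed directly from the group means in Table 1. The sketch below is only a worked check of that subtraction; the dictionary layout and function name are hypothetical.

```python
# Mean trials to criterion on the uncommon list, (young, old), from Table 1.
UNCOMMON_MEANS = {
    "neutral": (5.80, 16.40),
    "supportive": (5.20, 13.20),
    "challenging": (5.50, 22.00),
}


def difference_scores(means):
    """Old-minus-young difference in mean trials for each treatment."""
    return {
        treatment: round(old - young, 2)
        for treatment, (young, old) in means.items()
    }


diffs = difference_scores(UNCOMMON_MEANS)
print(diffs)
```

The gap between age groups is smallest under supportive instructions and largest under challenging ones, which is the ordering the t tests in Table 2 evaluate.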

The elderly Ss took approximately 
twice as many trials to relearn the un- 
common list as the younger Ss and also 
had larger standard deviations over all 
treatments. However, the differences 
among experimental treatments were not 
significant (F = 1.36). There was no sig- 
nificant effect for treatment on the re- 
learning phase of the experiment, nor was 
there an interaction between age and treat- 
ment. Unlike their effect upon original 
learning, challenging and supportive in- 
structions did not appear either to depress 
or to facilitate performance on the more 
difficult tasks. 

When the difference scores between the 
two age groups for relearning the uncommon 
pairs under the three experimental treatments
were compared, there was no significant
differential effect as a function of instructions.

Although there was a significant difference
between young and old in the
mean number of trials needed to learn 
common paired associates (F = 15.64, 
p < .001; 4.97, p < .05; 10.28, p <
.01) under neutral, supportive, and challenging
instructions, respectively, there
was no differential effect as a function of 
treatment. Despite the fact that the elderly 
group took longer than the young to master 
the easier task, the performance of older 
Ss was far more adequate on the common 
than on the uncommon list. There was no
significant age-related difference for the 
common pairs on the relearning part of 
the experiment. 


Discussion 


Elderly Ss showed a decrement in performance
in acquiring the uncommon pairs
under challenging instructions on the more 
difficult task. The increasing insecurity and 
susceptibility to stress of aging individuals
become particularly evident when
they are placed in an evaluative situation
and told that their performance will
be compared with that of others. Research
in anxiety and verbal learning indicates 
that the efficiency of Ss who score high on 
an anxiety scale seems to be more dis-
rupted by high motivation or personal 
threat situations than the performance of 
individuals with lower scores (Sarason, 
1960). Sarason (1958) noted that anxiety 
appears more effective in depressing per- 
formance when the material to be learned 
is difficult or complex. He also observed 
that under neutral or reassurance condi- 
tions differences in performance between 
Ss differing in test anxiety have been 
negligible or reversed, a finding which is 
consistent with the result in the present 
study that elderly Ss did best under sup- 
portive and worst under the challenging 
instructions. 

Although the older group required sig- 
nificantly more trials than the young to 
relearn the more difficult material, they 
did not appear to be differentially affected 
by the instructions. Obviously the chal- 
lenging treatment did not have the dis- 
ruptive effect upon performance on this 
phase of the experiment that it did when 
the material was being acquired. It is 
possible that when the material was pre- 
sented for the second time it may no 
longer have seemed novel nor as difficult as 
it did initially. Another explanation may 
be that with increased familiarization the 
older Ss learned to reduce anxiety 
through successful performance of the task 
in the earlier phase of learning. 

While the elderly Ss’ performance on the 
relearning of the uncommon list was poorer 
than that of the younger Ss under all 
treatments, proportionate to their acquisi- 
tion, their rate of relearning was better 


that a larger proportion 
under the challenging than under the neu- 
tral and supportive conditions made com- 
ments as to their awareness of increasing 
memory loss and “stupidity.” Six of the 
older Ss under the challenging treatment 
as compared with three under the neutral, 
and none under the supportive instructions 
failed to respond correctly to any items on 
the first learning trial. Although it cannot 
be presumed that anxiety per se was re- 
sponsible for the discrepancy in age groups, 
it does appear that the older person reacts 
with nonadjustive behavior in an evalua- 
tive situation where his performance is to 
be compared with that of others. 

The findings firmly support empirical 
evidence as to age-associated decrements 
in learning. However, motivational varia- 
bles appeared to be most effective in the 
initial stage of learning. If performance 
can be modified in the aged, the implica- 
tions for future work with older persons are 
manifold. Just as new teaching methods 
have facilitated learning in children, so 
new approaches to training and rehabili- 
tation may result in narrowing the per- 
formance gap between the generations. 
The effects of 2 classes of variables—(a) selected personality and 
motivational factors, and (b) certain task parameters in the form 


ty—on judgments by college age 


students of their college environment were investigated. Contrary 
to the assumptions implicit in most “perceptual” environmental assess-
ment scales, perceptions and thus descriptions of the college environ-
ment were not independent of the properties of S—e.g., personality 
and motivational characteristics—or properties of the items. Further- 


In recent years a considerable portion of 
educational research has been concerned 
with the study of environmental or situa- 
tional determinants of observed educa- 
tional behaviors. While most of this re- 
search has been limited to the college 
environment, the rationale and procedures 
employed with the college setting are 
readily generalized to other environments 
and other behaviors. Numerous college en- 
vironmental assessment techniques have 
been developed which purport to measure 
in some sense the dominant characteristics 
of the college environment. One popular 
class of these techniques concerns student 
perceptions and cognitions of the college 
environment, perhaps the most notable be- 
ing Stern’s (1958) College Characteristics 
Index (CCI) and Pace’s (1963) College 
and University Environment Scales 
(CUES). In using scales like these the 
typical procedure is to average responses 
from a set of respondents who are consid- 
ered homogeneous with respect to some 
characteristic of interest, and report these 
mean values as describing or profiling the 
particular environment studied. 

If a given environment is treated as a 
constant set of stimuli at any given point 
in time, then variability in response to 
this set of stimuli should be attributed to 
random error or to selected characteristics 
of the respondent or items. Foremost 
among such characteristics of the respond-
ent one might specify stylistic variance—
such as responding in a socially desirable 
or acquiescent manner—reliable personal- 
ity differences, or differences in the percep- 
tion of and the meaning attributed to the 
stimuli. Alternatively, it may be that with 
a complex environment the respondents 
are not all attending to the same elements 
within the set. Inequality of subsets of 
cues, even if they overlap, could lead to 
different responses to the same item. 


the expected value must be treated as error 


which is to be minimized for ef-
fective assessment (Torgerson, 1962). 


le of 611 fall term 1965 Georgia 
Institute of Technology freshmen, the range 
of item variances for the 300 items of the 
CCI was .009-.250, with the maximum pos- 
The median item variance 


tions of 4 
this rather large median ite! 
gests considerable I 
scores, and thus considerable lack of uni- 
formity in assessing characteristics of the 
environment. Pace (1968) aptly a 

this case when he said “... what is really 
characteristic of the school is that the stu-
dents disagree about its characteristics 
[p. 37].” 
Where substantial item variances obtain 
on a college environment scale, a major 
problem is encountered in deciding what 
is the most appropriate method of scoring 
the scale. A more important and intriguing 
question, one affecting both the substantive 
development and construction of such 
scales, concerns whether this variance can 
be accounted for by selected characteristics 
of the respondents and the items them- 
selves, rather than the environmental as- 
pect sampled. 

Two item characteristics of immediate 
appeal in this respect are item content and 
item ambiguity. Items comprising most 
“perceptual” environmental description 
scales vary widely along a continuum from 
high to low cue determinancy. Some items 
appear unambiguous in that they are eas- 
ily verified by scanning the environment, 
for example, “there are no fraternities or 
sororities.” Responses to items describing 
an ambiguous aspect of the environment— 
for example, “personality, pull, or bluff get 
students through many courses’”—seem 
particularly susceptible to the effects of 
certain perceptual or personality processes. 
For example, a student who is abasing, de- 
pendent, and fearful of his academic per- 
formance, regardless of his ability, may 
be quite defensive in response to items like 
the one cited relating to getting through 
courses. 

Another relevant aspect of this problem 
is the familiarity or sampling dimension. 
Extrapolating from perceptual learning 
studies (Wohlwill, 1966) one might expect 
a change in college student perceptions 
toward increased veridicality as a result 
of increased sampling or familiarity with 
the environment. Again, however, certain 
properties of the individual might be tied 
to parameters of this function as when 
some students are more resistant to percep- 
tual change under conditions of increasing 
input than are others. Associated with en- 
vironmental unfamiliarity, as would prob- 
ably be most true of the entering college 
freshman, one might expect greater response 
uncertainty and the more pronounced in-
fluence of personality, motivational, and 
attitudinal factors upon item response 


EDMOND MARKS 


(Cronbach, 1950; Gage, Leavitt, & Stone, 
1957). 

These comments form the background 
for the hypothesis being examined in the 
present study. Simply stated it is that a 
significant portion of what is presently as- 
sumed to be random error variance in 
scores on a selected college environment 
scale like the CUES can be attributed to 
the nonrandom effects of personality and 
sampling processes as they are elicited by 
selected item characteristics. In particular, 
it is postulated that item ambiguity and 
item content are reliably related to item 
variance, and that under certain condi- 
tions, for example, high item ambiguity, 
these item characteristics lead to the in- 
creased effects of selected personality and 
sampling variables upon item response. 

An important question encountered in 
testing this hypothesis is “...on what ba- 
sis should item ambiguity be defined?” One 
possibility is to define it in terms of expert 
judgment of the visibility of the environ- 
mental characteristic represented by the 
item. Another possibility is to define item 
ambiguity as the uncertainty experienced 
by S in making an item response. The first 
definition clearly reflects the inherent or 
“ideal” ambiguity of an environmental 
cue, that is, that ambiguity of a cue that 
remains even after the respondent has had 
some experience with the environment. The 
latter definition is more sensitive to the 
perceptual and judgmental processes pe- 
culiar to the individual respondent, and in- 
dicates, to a greater extent, his unique 
sampling tendencies. These two approaches 
to defining ambiguity differ also in an im- 
portant theoretical respect (Spence, 1944). 
Ambiguity defined as response uncertainty 
is primarily a response-inferred construct 
being tied almost entirely to S’s responses, 
whereas expert judgment is more nearly a 
stimulus characteristic. In this study, a 
measure of each approach was included; 
the measure based on expert judgments 
being called “judged item ambiguity,” 
while the measure based on student certi-
tude responses was called “item-response 
certitude.” 

The evidence relating to S correlates of 
responses to perceptual environmental scales 
is somewhat sparse (Herr, 1965; McFee, 
1961; Saunders, 1962). After an extensive 
factor analysis of the CCI and its compan- 
ion personality scales, the Activities Index 
(AI), Saunders concluded that the scale 
scores of the environmental measure were 
independent of the personality of the re- 
spondent. This conclusion was based upon 
the finding that, in general, the vectors de- 
fining each index spanned a unique sub- 
space of the total factor space. That is, the 
total number of interpretable factors ob- 
tained by factoring the CCI and AI to- 
gether could be broken down into two sets 
of factors, one group “that are loaded 
mostly by CCI variables and a second 
group ... loaded primarily by AI varia-
bles” (Saunders, 1962, p. 8). These data 
were not, however, completely “clean,” 
there being some confounding of factor 
structures. Using the same scales, McFee 
arrived at the same conclusion. In contrast, 
Herr in studying the High School Charac- 
teristics Index obtained significant rela- 
tionships between scores on this measure 
and certain ability and biographical vari- 
ables. Also of interest for the present study 
was Saunders’ and Herr’s finding of a 
rather substantial error variance for the 
environmental measures employed. 


Method 


Measures Employed 


Environmental measure. The college environ-
mental scales selected for study were the CUES 
(Pace, 1963). This inventory consists of 150 items 
which are broken down into five nonoverlapping 
scales of 30 items each. The five scales, labeled 
practicality, awareness, community, propriety, and 
scholarship, were defined on the basis of a factor 
analysis of the intercorrelations among the means 
of the 30 CCI scales for a sample of 50 colleges 
and universities. Items for the CUES were then 
selected from the 300 CCI items in terms of how 
well a given item defined one of the five scales. 
Subject variables. The personality and motiva-
tional variables used in this study were selected 
in terms of their hypothesized relationships with 
the content of the CUES items as defined by 
Pace (1963). Nine of the personality scales were 
drawn from the 22 scales of Jackson's (1965) 
Personality Research Form, Form A, each of 
which contains 20 items. The scales selected were 
achievement, affiliation, autonomy, cognitive 
structure, dominance, order, social recognition, 
succorance, and understanding. The test-retest 
reliabilities reported by Jackson (1965) for these 
nine scales ranged from .73 for cognitive structure 
to .88 for dominance. Two 10-item scales devel-
oped by Marks and Messersmith (Marks, 1967) 
relating to motivational aspects of educational 
behavior were also included. The two scales were 
level of educational and career aspiration and 
fear of failure, whose split-half reliabilities were 
.89 and .80, respectively. 

To evaluate differences due to cue sampling, 
the student was asked to indicate on a 5-point 
scale, the amount of information he had about, 
or how familiar he felt he was with, the college 
environment. Finally, each student was asked to 
indicate again on a 5-point scale, how certain he 
was when responding to a given CUES item, of 
the accuracy of that response. This variable was 
referred to as “item-response certitude” in the 
analysis. 


Procedure 


which was difficult to verify perceptually or for 
which the stimulus cues would tend to be vague, 
subtle, or conflicting.” Each rater was asked to 
read the entire list of items once before rereading 
them for categorizing purposes. Items for which 
there was less than a 4 to 1 agreement were de- 
leted from that part of the analysis relating to 
ambiguity. This variable was denoted “judged item 
ambiguity.” 

The 150 CUES items were grouped according 
to content in terms of the five first-order factors 
reported by Pace (1963). The content of an item 
was defined simply by Pace’s description of the 
scale to which the item belonged. 


carrying these out, however, the dependent varia- 
ble was transformed in order to more clearly re- 


ance. Since the major concern in this study was 
[...] the pro-
portion endorsing each item was transformed so 
as to lie within the range .50 ≤ p ≤ 1.00 [...] 
accomplished by setting [...] > .50 equal [...] 
and variance of the binomially distributed CUES 
items are inversely related under this transforma-
tion, p values tending towards .50 indicate in-
creased item variance. 
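
The relation between the transformed proportion and item variance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the printed description of the transformation is incomplete, so the folding rule used here (replace p by 1 - p whenever p falls below .50) is an assumption that merely reproduces the stated range of .50 to 1.00, and the function names are hypothetical.

```python
def fold_proportion(p: float) -> float:
    """Fold an endorsement proportion into the range .50 <= p <= 1.00.

    Assumed rule (not confirmed by the source): proportions below .50
    are replaced by their complement, 1 - p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return max(p, 1.0 - p)


def binomial_item_variance(p: float) -> float:
    """Variance of a true-false item endorsed with probability p."""
    return p * (1.0 - p)


# Illustrative proportions only; .04 is the endorsement value the
# Discussion reports for the "apple-polishing" item.
for raw_p in (0.04, 0.30, 0.50, 0.75):
    folded = fold_proportion(raw_p)
    print(raw_p, round(folded, 2), round(binomial_item_variance(folded), 4))
```

Because p(1 - p) is unchanged by folding and decreases steadily as the folded p moves from .50 toward 1.00, a folded value near .50 marks a high-variance item and a value near 1.00 marks a low-variance item.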

The item parameters—ambiguity, content, re-
sponse certitude, and proportion responding—were 
intercorrelated and their means and standard de-
viations computed. Those correlation estimates 
involving content represent contingency co-
efficients, while all others are product-moment cor-
relations. In addition, means, standard devia-
tions, and intercorrelations among the 12 S 
variables were computed. 
Examination of the joint distributions of a 
given CUES item and the S variables indicated 
that, in many cases, the variables did not form a 
bivariate normal density. Because of this condi- 
tion it was decided to examine the relationship of 
CUES item response to the selected S variables by 
testing for differences between the cumulative dis- 
tributions of a single S variable for the two item- 
response categories—true or false. Should a rela-
tionship exist, scores for one of the categories 
would be expected to shift toward higher values. 
For this purpose a test due to Kolmogorov and 
Smirnov was used (Siegel, 1956). 
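
The comparison of cumulative distributions described above can be illustrated with a short Python sketch of the two-sample Kolmogorov-Smirnov statistic. It is a hedged illustration, not the study's computation: the function names and the example scores are hypothetical, and the two-tailed large-sample critical value (coefficient 1.36 at the .05 level) is an assumption, since the text does not say whether a one- or two-tailed form was applied.

```python
from bisect import bisect_right


def ks_two_sample_d(sample_a, sample_b):
    """Largest absolute gap between two empirical cumulative distributions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical distributions only change at observed values,
    # so it is enough to compare them at those points.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value_05(n_a, n_b):
    """Large-sample two-tailed critical value of D at the .05 level."""
    return 1.36 * ((n_a + n_b) / (n_a * n_b)) ** 0.5


# Hypothetical achievement scores for Ss answering one item "true" or "false".
true_scores = [15, 16, 16, 17, 18, 19]
false_scores = [10, 12, 13, 14, 15, 16]
d = ks_two_sample_d(true_scores, false_scores)
print(d, ks_critical_value_05(len(true_scores), len(false_scores)))
```

With samples this small an exact table would be consulted rather than the large-sample formula; the formula is shown because the groups in a sample of several hundred Ss would ordinarily be large.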

As previously indicated, only those personality 
and motivational variables which were suspected 
on the basis of the congruence of variable and 
item contents of being sensitive to the hypothesis 
of a significant item-variable correlation were se- 
lected for study of their relationship to a given 
CUES item. Scholarship items were related to 
achievement, level of aspiration, and fear of fail- 
ure; propriety items to cognitive structure, domi- 
nance, and order; community items to affiliation, 
autonomy, and succorance; practicality items to 
affiliation, order, and social recognition; and fi- 
nally, awareness items to the single variable of 
understanding. The items were also related to re- 
ported familiarity with the environment. 

To facilitate the analysis, the number of CUES 
items examined was reduced by systematically se- 
lecting a smaller number of items which would 
permit evaluation of the hypothesis of an inter- 
action of item-S parameters. The 150 CUES items 
were cross-classified in a 6 × 5 table defined by 
item content and mean item-response certitude, 
and a total of 25 items selected by randomly 
choosing one item from each cell. Since the two 
item parameters were correlated in the sample, 
sampling of the items was not uniform over the 
30 cells of this table. 
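
The item-selection step can be sketched as follows. This is a minimal illustration under stated assumptions: the six certitude intervals (upper bounds 2.5 through 5.0 in steps of .5) are inferred from the row labels of Table 3, the item records are invented, and the function names are hypothetical. Cells that contain no items are simply skipped, which is why fewer than 30 items result when the two item parameters are correlated.

```python
import random
from bisect import bisect_left
from collections import defaultdict

# Assumed upper bounds of the six mean-certitude intervals.
UPPER_BOUNDS = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]


def certitude_level(mean_certitude):
    """Return the index (0-5) of the interval containing mean_certitude."""
    level = bisect_left(UPPER_BOUNDS, mean_certitude)
    if level >= len(UPPER_BOUNDS):
        raise ValueError("mean certitude exceeds the 5-point scale")
    return level


def select_one_item_per_cell(items, seed=0):
    """Randomly pick one item from every non-empty content-by-certitude cell.

    items: iterable of (item_id, content_class, mean_certitude) tuples.
    """
    cells = defaultdict(list)
    for item_id, content, mean_certitude in items:
        cells[(content, certitude_level(mean_certitude))].append(item_id)
    rng = random.Random(seed)  # seeded so the selection is reproducible
    return {cell: rng.choice(ids) for cell, ids in sorted(cells.items())}


# Invented records for illustration only.
example_items = [
    (1, "scholarship", 4.6),
    (2, "scholarship", 4.7),
    (3, "awareness", 2.4),
    (4, "awareness", 2.9),
]
print(select_one_item_per_cell(example_items))
```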


Subjects 


The Ss were 570 male freshmen entering Georgia 
Institute of Technology in the fall term of 1966. 
The tests were administered during the week prior 
to registration. 


Results 


The means, standard deviations, and in- 
tercorrelations of the four item variables 


TABLE 1 
MEANS, STANDARD DEVIATIONS, AND INTERCORRELATIONS AMONG p, AMBIGUITY, 
CERTITUDE, AND CONTENT 

Item      | Ambiguity | Certitude | Content(a) | N   | M    | SD 
p         | -.05 (ns) | .58*      | .42*       | 150 | 74.8 | 17. 
Ambiguity |           | -.08 (ns) | .28 (ns)   | 139 | 2.9  | .3 
Certitude |           |           | .47*       | 150 | 3.5  | .6 

Note.—Since the Content classification had no ordinal properties, 
the mean and standard deviation are not reported. 
(a) Contingency coefficient. 
* p < .05. 
are presented in Table 1. The values in- 
volving item ambiguity are based on only 
139 cases. Eleven items had to be deleted 
because they failed to satisfy the criterion 
of a 4 to 1 agreement among judges. Be-
cause of differences in the correlational 
methods employed, conclusions are best 
limited to statements concerning signifi- 
cance, not magnitude. 

In spite of these qualifications it is ap- 
parent that there is no association between 
judged ambiguity of an item and the 
three other item characteristics. The hy- 
pothesized correspondence between judged 
item ambiguity and the students’ cognitive 
and response processes, that is, certitude 
judgments and the proportion selecting a 
given alternative, failed to emerge. Item 
content, on the other hand, was signifi- 
cantly correlated with both these cognitive 
and response processes. An inspection of 
the respective contingency tables indicated 
that these correlations were due primarily 
to two of the five content categories; 
scholarship and awareness. Scholarship 
items tended to have high item-response 
certitude means and high p values or low 
item variances, while the awareness items 
tended to have low mean response certi- 
tude values and p values which tended 
more toward .50—high item variances. 

Mean item-response certitude, as an in-
dex of the indeterminancy the item pos-
sessed for the sample of students, aside 
from correlating significantly with item 
content, also correlated substantially with 
the proportion selecting a given alterna- 
tive. Items which were described by the 
sample as eliciting uncertainty as to the 
accuracy of response, tended to have 
high item variances. 

The means, standard deviations, and in-
tercorrelations among the 12 personality, 
motivational, and familiarity variables 
are presented in Table 2. 

Although not of direct interest in terms 
of the hypotheses being examined, com-
ment should be made on some of the corre-
lations in Table 2. Quite noticeable is the 
lack of correlation between the students’ 
reported familiarity with the college en-
vironment and the other variables studied. 
At least for this set of variables, students’ 


RESPONSES TO AN ENVIRONMENTAL DESCRIPTION SCALE 271 

TABLE 2 
MEANS, STANDARD DEVIATIONS, AND INTERCORRELATIONS AMONG THE 12 SUBJECT VARIABLES 

Columns: Achievement | Affiliation | Autonomy | Cognitive structure | 
Dominance | Fear of failure | Level of aspiration | Order | 
Social recognition | Succorance | Understanding | Familiarity 

Achievement          -.05  .09* 
Affiliation          -.31* 
Autonomy 
Cognitive structure 
Dominance 
Fear of failure 
Level of aspiration 
Order 
Social recognition 
Succorance 
Understanding 
M                    13.6  4.1  8.3 
SD                   3.5   3.5  3.0 

Note.—N = 570, df = 500. 
* p < .05. 

judgments of their familiarity with the in- 
stitution studied, were independent of the 
personality characteristics of the respond- 
ent. 

The intercorrelations among achieve-
ment, level of aspiration, and understand-
ing were suggestive of a form of invest-
ment in intellectual activity which has 
both motivational and cognitive compo- 
nents. This pattern is consistent with 
Murray’s (1938) treatment of Need for 
Understanding, and perhaps, Tolman’s 
(1951) “placing need.” On the other 
hand, fear of failure, order, and cognitive 
structure were reliably correlated, suggest-
ing that students who are fearful of their 
performance tend to approach their per- 
sonal and situational involvements in a 
cautious and orderly way, thus apparently 
reducing the perceived possibility of sub- 
standard performance. Students higher in 
these traits can be viewed as having diffi- 
culty in handling environmental situa- 
tions which depart from the expected. 

The tests of association between the se-
lected CUES items—cross-classified on 
item content and mean item-response cer-
titude—and the personality, motivational, 


TABLE 3 
SUMMARY OF THE TESTS OF ASSOCIATION BETWEEN THE SELECTED COLLEGE AND 
UNIVERSITY ENVIRONMENT SCALES ITEMS AND SUBJECT VARIABLES 

Mean item-response certitude | Item content: Practicality 
2-2.5                        | 
2.51-3                       | Affiliation; Social recognition* 
3.01-3.5                     | Order*; Social recognition 
3.51-4                       | Order; Social recognition 
4.01-4.5                     | Order*; Social recognition 
4.51-5                       | Affiliation; Social recognition 

% of tests: 53 

Note.—N = 570. 
* p < .05. 
and environmental familiarity variables 
are summarized in Table 3. Within each 
cell—corresponding to a single CUES item 
for a given level of mean item certitude 
and content class—the selected S variables 
are listed and the significance of the 
item-variable association noted. 

The notion that personality and moti- 
vational variables are related to item 
response on an environmental assessment 
scale appears supported by the data; over 
30% of the relationships tested were sig- 
nificant at the .05 level. This conclusion is 
offered cautiously, since the test criteria 
are probably not independent. In addition, 
the association between item response 
and the respective S variables appears to 
be moderated by the two item parameters 
studied. Scholarship items were, in most in-
stances, significantly related to all three of 
the S variables hypothesized to be relevant to 
this content class. Similarly, three of the 
six awareness items were significantly re- 
lated to understanding, while community 
items appeared related to both affiliation 
and succorance. The results for the prac- 
ticality and propriety classifications were, 
however, much less indicative of a relia- 
ble effect of personality and motivation 
upon responses to the CUES items. 

Once again, familiarity failed to emerge 
as a correlate of response variability. 
Students’ reports of their familiarity with 
the Georgia Institute of Technology en- 
vironment bore little relation to the vari- 
ability of their judgments of its charac- 
teristics. 

Despite the caution noted concerning 
overall tests of significance, the results of 
this part of the analysis provide rather 
good evidence that responses to some items 
of the CUES are dependent upon certain 
characteristics of Ss and the items. 


Discussion 


The results of the present study, par- 
ticularly those relating to S correlates of 
the CUES item variance, are perhaps best 
treated as providing reliable but limited 
evidence for the presence of nonenviron- 
mental factors in the response to the items 
of a selected environmental assessment in-
strument. They are neither exhaustive of 
the possible relationships that might exist 
between these two domains, nor do they in- 
dicate the magnitude of the effects of such 
nonenvironmental variables upon item re- 
sponse. What these results do indicate is 
that for some of the selected S and item 
characteristics studied, a reliable portion of 
the response to a given environmental char- 
acteristic can be attributed to certain 
properties of S. Since it is rarely the intent 
of the constructor of environment scales 
to provide for an S component of the item 
variance, this reliable component must be 
incorporated in the error variance. For 
some CUES items what is really being 
characterized to a great extent is the sam- 
ple of students—not the environment. 

Of particular interest is the lack of as- 
sociation between the reliable judgments of 
the CUES item ambiguity and the stu- 
dents’ reported item-response certitude. 
Students apparently develop, through some 
undefined mechanism, a set of stable per- 
ceptions and cognitions about the environ- 
ment to which they are responding which 
is independent of the number and clarity 
of the environmental cues available. Given 
an item like, “There is a lot of apple-pol- 
ishing around here,” where one might sus- 
pect the environmental cues to be vague 
and poorly defined, one nonetheless finds 
a very low endorsement value—p = .04. 
This raises the important questions of how 
are stimulus cues utilized by the student in 
making an environmental judgment, and 
second, how are these environmental per- 
ceptions and cognitions formed. Further- 
more, although these perceptions and cog- 
nitions are consistent in that they are 
shared by the sample as a whole, there is 
the question of the veridicality of such 
judgments. It is doubtful whether items 
tapping an environmental aspect of high 
cue indeterminancy can reflect a uniform 
property of the environment, or lead to 
high consistency of response. As suggested 
later, a part of this response consistency, 
where environmental cues are vague or 
conflicting, might be attributable to se-
lected personality and need structures of 
the student. A student who perceives his 
college academic environment as highly 
rigorous and demanding is unlikely to 
engage in the dissonant response of endors-
ing the “apple-polishing” item, regardless 
of the nature and number of cues avail-
able on this environmental characteristic. 
As hypothesized, certain item parameters 
were related to the variances of the CUES 
items, and equally important, they inter-
acted with certain of the S variables in deter-
mining response variability. In particular, 
certain of the factor analytically defined 
content classes were related to S uncer- 
tainty and item variability. This, in itself, 
may be a function of the institution or en- 
vironment being studied. At Georgia In- 
stitute of Technology, for example, the em- 
phasis upon academic achievement and 
competition, and the rigorous pursuit of the 
acquisition of knowledge is quite noticea- 
ble; this set of cognitions being shared by 
the students, faculty, and administration. 
This particular perceptual system—best 
described by Pace (1963) in terms of 
scholarship—provides considerable uni-
formity of response, high response certi-
tude, high p values, and low item vari-
ances. On the other hand, the content area 
labeled awareness by Pace (1963), and de- 
scribed by him in terms of reflectiveness, 
self-understanding, interest in human wel- 
fare, and in general, a concern for “per- 
sonal, poetic, and political” meaning, is 
much less clearly articulated at Georgia 
Institute of Technology. In this area the 
Georgia Tech student apparently has fewer 
and more poorly defined cognitions upon 
which to base his responses. In addition, 
the elements of this content class prob- 
ably have less subjective utility for the 
Georgia Tech student during this interval 
of his life. The low response certitude 
mean values and response proportions 
tending towards .50 reflect this lack of a 
perceptual and cognitive frame with re-
spect to this dimension. 
As intimated earlier, responses to some 
items of the CUES are reliably related to 
selected personality and motivational vari-
ables, with these relationships being moder-
ated by item content and the mean re-
sponse certitude associated with the item. 
The environmental area where these effects 
were most pronounced was scholarship. 
Students high in achievement, level of 
aspiration, or fear of failure were con-
siderably less variable in their response to 
items drawn from this content class than 
were students who obtained lower scores. 
Apparently, students who score high on 
these traits have a greater need to perceive 
their environment in a particular way and 
are more greatly affected by an environ-
mental cue—in this case an item describ-
ing the environment—which is discrepant 
with their environmental expectations. 
This would be particularly true for items 
describing the environment which elicit 
considerable uncertainty among the re- 
spondents. In this case, the student—by 
not having a well articulated perceptual 
and cognitive frame for evaluating the 
given environmental cue—tends to rely 
more heavily on his personality or motiva- 
tional domains as determinants of his 
CUES item response. It seems reasonable 
that under these conditions—that is, high 
environmental uncertainty and high need 
—the student will emit a CUES response 
that conforms to or is congruent with his 
particular need structure. As such, greater 
uniformity of response—that is, low item 
variances—may be viewed as reflecting an 
attempt by these students to maintain a 
congruence between the kinds of en- 
vironmental supports they seek, and their 
perceptions of certain environmental in-
puts as implied in the items. Under this in-
terpretation, for example, a student high in 
achievement who perceives himself as a 
hard worker who earns everything he gets 
is less likely to engage in the dissonance-
producing response of endorsing an “ap-
ple-polishing” or “personality, pull, and 
bluff” type item, or any other item in-
volving achievement by means of duplic-
ity. 

A similar interpretation, utilizing con-
gruence between personality or need states 
and environmental input as the mecha-
nism underlying item response, can be of-
fered for the other content classes where 
significant associations were obtained. 
Here again the perceptual needs of the 
students as defined by their scores on the 
personality and motivational variables 
are left intact by the highly selective re-
sponse to the environmental items. This 
interpretation is obviously consistent with 
and draws heavily from the dissonance and 
imbalance theories of Festinger (1957), 
Heider (1958), and others. 

To this point, the authors have been 
stressing the role of item content as moder- 
ating the relationship of the selected per- 
sonality and motivational variables to item 
response. Mean item-response certitude also 
tends to serve a similar function, although 
its effects are less pronounced than item 
content. A part of this attenuation may be 
tied to the increasingly restricted range of 
CUES item responses as mean item-response
certitude increases. Nonetheless,
both the certitude an S ascribes to the
accuracy of his item response and item
content are related to the nature and mag- 
nitude of the correlations between the per- 
sonality and motivational variables, and 
item response. 

A final statement should be made con- 
cerning the relationship between the mag- 
nitude of uncontrolled or error variance in 
item scores, and the method of factor anal- 
ysis employed in constructing and inter- 
preting the CUES scales. The irrelevant 
personality and motivational factors 
demonstrated in this study serve to
increase both item and scale variances. With
scale scores defined on a fixed interval— 
in this case from 0 to 30—this increased 
scale variance has the effect of pulling the 
institution means closer together. Although 
factoring group means would appear to 
disregard S differences, focusing rather 
upon institutional differences, it is apparent 
that S differences reemerge by attenuating
group mean covariances. Unreliability
of the group scale means, as reflected in
the scale variances and differentially
contributed to by the personality and
motivational factors, must be considered when
using a procedure like that employed in
constructing the CUES.


SOURCE AND DIRECTION OF CAUSAL INFLUENCE IN
TEACHER-PUPIL RELATIONSHIPS¹


ALBERT H. YEE 
University of Wisconsin, Madison 


To study the source and direction of influence in teacher-pupil rela- 
tionships, attitudes of teachers and their intermediate-grade pupils 
were measured early in September and several months later. Sample 
included 102 teachers with middle-class (MC) pupils and 110 teachers 
with lower-class (LC) pupils. The MTAI and a semantic differential 
measured teachers’ attitudes; a 100-item About My Teacher inventory 


pupils and their teachers have more mutual influence relationships;
(c) greater differences between levels of pupils’ social class than 
between levels of teacher experience; and (d) serious need to im- 
prove teachers’ relationship with LC pupils. 


Teachers differ in their attitudes of 
warmth and permissiveness toward pupils. 
Classes differ in the favorability of pupils’ 
perceptions toward their teachers. The two 
kinds of differences have consistently been 
found to correlate about .2 to .6 in the up- 
per elementary grades (Getzels & Jackson, 
1963, pp. 508-522); that is, warm teachers 
tend to be found in classes whose pupils 
like their teacher. 

When a correlation occurs, the question 
of causality may be raised. In teacher-pupil
attitude relationships, do the teacher’s
attitudes cause the class’ favorability
toward their teacher? Or does it work the
other way around, so that friendly pupils
make the teacher become warm and
permissive? Theories and studies of social
interaction portray the teacher-pupil
relationship as complex and reciprocal (Bush,
1954; Della Piana & Gage, 1955; Flanders,
1965; Gage, Runkel, & Chatterjee, 1963;
Ryans, 1960; Smith, 1960; White & Lippitt,
1960; and Withall & Lewis, 1963,
pp. 708-710). Thus, the direction of
influence in teacher-pupil relationships merits
investigation.

¹ Appreciation is expressed to N. L. Gage, who
gave freely of his advice and encouragement
throughout the project. The research reported
[...] with the United States Department of
Health, Education, and Welfare, Office of
Education, under the provisions of the Cooperative
Research Program. A Hogg Foundation grant
supported essential preparations prior to [...]
of the second year’s work. Preliminary drafts
of the article were written during the author’s
[...]-sponsored postdoctoral research training
[...] at the University of Oregon. Completion
of the work was partially supported by a
grant from the University Research Committee
of the Graduate School, University of Wisconsin,
Madison.

Earlier studies (Heil & Washburne, 1962; 
Hoyt & Cook, 1960; Rabinowitz & Rosen- 
baum, 1960) have indicated that teachers’ 
attitudes of warmth and permissiveness 
vary with years of teaching experience. 
Preservice and beginning teachers’ scores 
on the Minnesota Teacher Attitude Inventory
(MTAI) are considerably higher, on
the average, than those of teachers with
some experience (Beamer & Ledbetter,
[...]). [...] 2 years of teaching, MTAI
scores become stabilized at about the level
found prior to teacher preparation. Such
[...]tion with pupils [...] passage of time;
thus, Day (1959)
found that graduates who prepared for
but did not enter teaching shifted less in
attitudes than did those that entered teaching.

[...] this evidence concerning the reciprocity
of teacher-pupil relationships,
the decline of warmth and permissiveness
during the first years of teaching, and the
role of contact with pupils rather than mere
forgetting in this decline, one arrives at the
following questions: (a) What is the direction
of influence, from teachers to pupils
or from pupils to teachers? (b) Is influence
from pupils to teachers found more often
in the classrooms of beginning teachers
than in those of experienced teachers?²
Finally, since teacher warmth may be 
especially significant for lower-class pupils 
(Gage, 1965), (c) does the direction of 
influence differ according to the pupils’ 
social-class background? 

To answer these questions, a 2-year study 
was conducted, starting in 1964 (Yee, 1966). 
Preliminary analyses indicated the need to 
distinguish between the direction as well as 
the source of interpersonal influence. Thus, 
for the direction of influence, we consider 
influence to be congruent when attitudes of 
source and influence become more positively 
correlated from pre- to posttest occasion 
and incongruent when attitudes become 
less positively correlated over time. With 
these distinctions, the following hypotheses 
were tested for total sample, three levels of 
teachers’ years of teaching experience, and 
two levels of pupils’ social class in one- and 
two-factor analyses. 


Hypotheses 


H₁: Teacher-class pairs showing teacher
influence toward either congruity (TC) or
incongruity (TI) are more frequent than
those showing pupil influence toward either
congruity (PC) or incongruity (PI), that is,
TC + TI > PC + PI.

H₂: TC > PC.

H₃: TI > PI.


METHOD


Instruments 


Teachers’ attitudes were measured with (a)
the MTAI and (b) a semantic differential (Osgood,
Suci, & Tannenbaum, 1957) prepared for
this study with My Class as the concept and 
17 bipolar adjectives highly loaded on the evalu- 
ative dimension. 

Pupils’ attitudes were measured with the 100-item
About My Teacher (AMT) inventory (Beck,
1964) developed on five dimensions of teacher’s
merit: affective, cognitive, disciplinary,
innovative, and motivational. This inventory yielded
a total score (P₀, P₀′; unprimed symbols indicate
pretest and primed symbols indicate posttests)
and 11 subscores obtained on the basis of
multiple-factor analyses (principal axis, rotated by
Kaiser’s Varimax method³) of the mean pupil
ratings of their teachers. Identical or very similar
factors were extracted from separate analyses
of the middle-class and lower-class pupils’
responses.

² For suggesting teacher experience as a factor
in our investigation, the author is indebted to
R. L. Debus, University of Sydney.

The factor analyses provided the basis for 11 
measures of dimensions of pupils’ perceptions of 
their teachers: 

P₁: popularity and effectiveness in instruction
(17 items, such as “Do you think your teacher
understands people your age? Does your teacher
make sure everybody understands the teacher?”)

P₂: personal popularity and warmth toward
children (6 items, such as “Does your teacher
seem to like children?”)

P₃: irritability and moodiness (3 items, such
as “Is your teacher often cross?”)

P₄: explaining ability and communication, as
measured with negatively stated items (9 items,
such as “When you ask your teacher a question,
do you often just get more confused?”)

P₅: explaining ability and communication, as
measured with positively stated items (8 items,
such as “Does your teacher make difficult things
easy to understand?”)

P₆: effectiveness in developing an atmosphere
of responsible pupil conduct, or pupils’
perceptions of their own orderliness (9 items, such as
“Do the children behave well for your teacher?”)

P₇: disciplining behavior (3 items, such as
“Does your teacher succeed in keeping the pupils
under control?”)

P₈: innovative behavior, tendency to use
audiovisual materials and field trips (3 items, such as
“Does your class go on field trips that help you
understand what you are studying?”)

P₉: innovative behavior, tendency to
individualize instruction in the choice of materials and
methods (3 items, such as “Do all the pupils in
the class use the same book at the same time?”)

P₁₀: motivating pupils’ interest and
enthusiasm, as measured by positively stated items
(6 items, such as “Does your teacher make you
feel like working real hard at your school work?”)

P₁₁: motivating pupils’ interest and
enthusiasm, as measured with negatively stated items
(3 items, such as “Is your teacher making school
work less interesting for you this year?”)

For measures of teachers’ attitudes, the total
MTAI scores were supplemented by three
factors extracted by Horn and Morrison (1965):
Factor I, Traditionalistic Versus Modern [...]
about Child Control (T₁, T₁′); Factor II,
Unfavorable Versus Favorable Opinions about
Children (T₂, T₂′); and Factor III, Punitive
Intolerance Versus Permissive Tolerance for
Child Misbehavior (T₃, T₃′).

³ With computer program by D. J. Veldman on
file at the Computation Center, The University
of Texas.


Sample

Data for this study were secured from 102 
teachers and their pupils in 32 schools of middle- 
class (MC) neighborhoods located in the San 
Francisco bay area and central Texas (in Grade 
4, n = 33; Grade 5, n = 36; Grade 6, n = 33) and
110 teachers and their pupils in 18 schools of
lower-class (LC) neighborhoods located in central
Texas (in Grade 4, n = 39; Grade 5, n = 38; Grade
6, n = 31; Grade 7, n = 2). Recruiting procedures
conducted with school administrators created no 
known systematic bias in the sample selected. 
The sample obtained is believed to be a fair 
representation of LC and MC pupils and their 
teachers, even though rigorous random selection 
was not possible. 

Social-class status was determined by consul- 
tation with school administrators and informal 
inspection of neighborhoods. Family income 
($4,000 or less annually for lower class; $6,000 
or more for middle class) and father’s occupation 
(blue collar and unskilled for lower class; white 
collar and professional for middle class), as
ascertained from school administrators, were the
main criteria for establishing social-class status. 

For analyses to be discussed, subsamples were 
classified by pupils’ social-class background and 
teachers’ years of experience, as follows: for LC, 
0-1 years (five with 1 year’s experience), n = 25;
2-8 years (average of 4.6 years), n = 36; 9+ years
(average of 19.9 years), n = 49; and for MC, 0-1
years (four with 1 year’s experience), n = 39;
2-8 years (average of 5.5 years), n = 31; 9+ years 
(average of 17.2 years), n = 32. 


Procedure 


The frequency-of-change-in-product-moment 
(FCP) technique was developed to tabulate each 
teacher-class unit under one form of teacher or 
pupil influence, that is, TC, TI, PC, or PI. This
technique and others are more fully discussed in
Yee and Gage (1968).

The following procedures were originated to
obtain frequencies for chi-square tests of
significance:

1. Raw scores of teachers’ and pupils’ attitudes
(class means) were converted to standard scores.

2. The nature or direction of influence
(congruent or incongruent) was determined by
noting if cross-products of posttest z scores were
more positive or negative than cross-products of
pretest z scores. If the cross-product of posttest
z’s, $z_{T_n'} z_{P_a'}$, was more positive than $z_{T_n} z_{P_a}$,
the direction of influence was said to be congruent,
that is, the relationship between the teacher and
her class helped make the overall correlation more
positive. If the cross-product of posttest z’s was
more negative, the direction of influence was called
incongruent, that is, the relationship between the
teacher and her class helped make the overall
correlation more negative. This manner of
assessing direction of influence is logically
connected with the basic formula for product-moment
correlation coefficients, that is, $r = \sum z_x z_y / (N - 1)$.

3. The source of influence was determined by
taking cross-lagged z products, $z_{T_n} z_{P_a'}$ and $z_{P_a} z_{T_n'}$.
When direction of influence was congruent, the 
more positive product was classed as source, that is, 
it helps to increase the cross-lagged correlation 
where effector’s z score is from pretest occasion and 
z score of party influenced is posttest. When 
direction of influence was incongruent, the more 
negative product was classed as source, that is, 
it helps to increase the cross-lagged correlation 
where effector’s z score is from posttest occasion 
and z score of the one influenced is pretest. 
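
The three steps above can be sketched in code. The following is a minimal
illustration, not the author's program: the function names are hypothetical,
z scores are computed with the sample standard deviation (N - 1) to agree with
the correlation formula in Step 2, and ties are resolved arbitrarily (a tie on
direction counts as incongruent, a tie on source counts as pupil), which the
procedure as described does not specify.

```python
from statistics import mean, stdev


def z_scores(values):
    """Step 1: convert raw class-mean scores to standard scores (N - 1 in the SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def classify_unit(z_t, z_p, z_t_post, z_p_post):
    """Return 'TC', 'TI', 'PC', or 'PI' for one teacher-class unit."""
    # Step 2: congruent if the posttest cross-product is more positive
    # than the pretest cross-product; otherwise incongruent (assumed tie rule).
    congruent = z_t_post * z_p_post > z_t * z_p

    # Step 3: cross-lagged products (pretest of one party, posttest of the other).
    teacher_lag = z_t * z_p_post   # teacher pretest with pupil posttest
    pupil_lag = z_p * z_t_post     # pupil pretest with teacher posttest

    if congruent:
        teacher_is_source = teacher_lag > pupil_lag   # more positive product
    else:
        teacher_is_source = teacher_lag < pupil_lag   # more negative product

    return ("T" if teacher_is_source else "P") + ("C" if congruent else "I")


def fcp_frequencies(t_pre, p_pre, t_post, p_post):
    """Tabulate TC, TI, PC, and PI frequencies over all teacher-class units."""
    standardized = [z_scores(col) for col in (t_pre, p_pre, t_post, p_post)]
    counts = {"TC": 0, "TI": 0, "PC": 0, "PI": 0}
    for unit in zip(*standardized):
        counts[classify_unit(*unit)] += 1
    return counts
```

Each unit falls into exactly one of the four cells, so the four frequencies
always sum to the number of teacher-class units, as they do in each row of
Table 1.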

The three hypotheses were tested with the 
general computing formula for chi square (Guil- 
ford, 1965, p. 230). The hypotheses as stated call 
for a directional or a one-tailed test of significance; 
therefore, the .05 level of significance requires a 
chi-square value of at least 2.71 with 1 df. In com- 
puting a chi-square value, Yates’ correction for 
continuity (Guilford, 1965, pp. 237-239) was 
applied to the frequencies. 
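
As an illustration of the test just described, the sketch below computes a
1-df chi square with Yates’ correction for two observed frequencies. It
assumes the null hypothesis is an equal split between the two frequencies;
that assumption is not stated in the text, but it reproduces the total-sample
values reported in Table 1 (TC = 60, TI = 65, PC = 50, PI = 37). The function
names are hypothetical.

```python
import math


def yates_chi_square(f1, f2):
    """1-df chi square for two frequencies against an equal split, Yates-corrected."""
    expected = (f1 + f2) / 2
    # Both cells deviate from the expected value by the same amount,
    # so the two corrected terms are identical.
    corrected = abs(f1 - expected) - 0.5
    return 2 * corrected ** 2 / expected


def one_tailed_p(chi_square):
    """Directional p for 1 df: half of P(X > chi_square) = erfc(sqrt(chi_square / 2))."""
    return 0.5 * math.erfc(math.sqrt(chi_square / 2))


tc, ti, pc, pi = 60, 65, 50, 37            # total sample, Table 1
h1 = yates_chi_square(tc + ti, pc + pi)    # H1: TC + TI > PC + PI
h2 = yates_chi_square(tc, pc)              # H2: TC > PC
h3 = yates_chi_square(ti, pi)              # H3: TI > PI
print(round(h1, 2), round(h2, 2), round(h3, 2))   # prints 6.46 0.74 7.15
```

The chi-square value itself carries no sign, so a directional test also
requires checking that the observed difference lies in the hypothesized
direction. Under this sketch, `one_tailed_p(2.71)` is approximately .05,
which agrees with the critical value cited above.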


RESULTS AND DISCUSSION⁴


Reliability, Rectilinearity, and Stability 


Coefficients of internal consistency for 
the responses to both teacher inventories 
were computed with the Guttman (1945) 
I, formula. The coefficients for the My 
Class inventory and the MTAI were all in 
the high .80’s.

Coefficients of pupil agreement in rating
their teachers on the AMT were computed
with Horst’s (1949) formula. The Horst r
for the total score (P₀) obtained at pretest
was .86 for the 110 classes of LC pupils
and .89 for the 102 classes of MC pupils;
for total scores at posttest, rs were .88 and
.91. When separate Horst [...]
these r’s averaged .82 for the 110
teachers of LC pupils and .85 for the 102
teachers of MC pupils.

Rectilinearity of the relationships between
teachers’ attitudes and pupils’
perceptions was estimated by inspection of
over 30 scatter-plots machine drawn for a
random sample of r’s. No curvilinear
relationship was found.

⁴ [...] Napier and W. E. Geeslin provided
[...] computer assistance for this
study with the help of Janice Willenborg.

Stability coefficients were higher for the 
measures of teachers than those of pupils. 
For the total sample (N = 212), $r_{TT'}$ =
.79 and $r_{PP'}$ = .68. As expected, teachers’
measures increased in stability with teaching
experience. For teachers of 0-1 years’
experience (n = 64), 2-8 years (n = 67),
and 9-46 years (n = 81), $r_{TT'}$ equalled .71,
.81, and .84, respectively. This regular
increase in MTAI stability with increasing
teacher experience is also evident when the
teachers in each social-class group are divided
according to experience. MC pupils’ attitudes
tend to be more stable than those
of LC pupils, for example, for LC, $r_{PP'}$ =
.62 and for MC, .71.


FCP Analyses 


The direction of attitude change when 
found in statistically significant results is 
in general from teachers to pupils, that is, 
teachers’ attitudes cause pupils’ attitudes 
to change more than pupils’ attitudes cause 
teachers’ attitudes to change. The most 
striking FCP results were obtained in 
analyses of attitude relationships with
teachers’ T₀ and T₁ scores.

In general, the attitude means for teachers
of LC pupils were significantly lower than
those for teachers of MC pupils (Yee, 1968).
For example, respective T₁ means for LC
and MC pupils’ teachers with 9+ years’
experience were as follows: T₁′ = 45,
SD = 9.80; T₁′ = 6.41, SD = 9.12. The
low T₁ scores for teachers of LC pupils
indicate such teachers possess traditionalistic
and inflexibly negative attitudes
toward child control. The higher T₁ scores
for MC pupils’ teachers show a more
permissive, positive, and flexible attitude
toward controlling children. Thus, analyses
with this teacher attitude indicate great 
contrast between LC and MC subsamples 
and many instances of teacher dominance 
over LC pupils in attitude relationships. 

Results of analyses with other teacher 
measures indicate the direction of influence 
to be from teachers to pupils in the pre- 
ponderance of frequencies favoring teacher 
influence, but fewer statistically significant 
results were found with them than were 
found with T₀ and T₁. Since T₁ was the
primary factor extracted by Horn and
Morrison (1965) and closely resembles the
primary factor extracted by this writer in a
preliminary factor analysis of the MTAI,
T₁ accounts for more variance between
teachers and appears to provide more
salient response consistencies toward
children than other teacher measures.

Table 1 illustrates the tabulation of
frequencies by the FCP technique.


TABLE 1
Correlations and Frequencies-of-Change-in-Product-Moment Results for Teacher
Variable (T₁) and Pupil Variable (P₅)

Social  Teachers'             Correlations                Frequencies     Chi squares for hypothesesᵃ
class   years exp.   N   T₁T₁′  P₅P₅′  T₁P₅   T₁′P₅′   TC   TI   PC   PI      H₁      H₂      H₃

Both    0-46       212   .79    .59    .00    .06      60   65   50   37     6.46     .74    7.15
LC      0-46       110   .82    .55    .14    .16      37   38   23   12    13.38    2.88   12.0
MC      0-41       102   .74    .64   -.12   -.05      25   26   30   21      .01     .29     .3
Both    0-1         64   .71    .68   -.14    .02      21   15   14   14      .77    1.03     .08
Both    2-8         67   .75    .53   [...]  [...]     20   21   14   12     2.93     .74    1.98
Both    9-46        81   .85    .55   -.03   -.05      24   33   15    9    12.64    1.64   18.60
LC      0-1         25   [...]  .67    .10    .26       8    7    6    4      .64     .07   [...]
LC      2-8         36   .77    .57   [...]   .37      18   10    3    5    20.03    9.83    1.07
LC      9-46        49   .83    .50    .01   -.07      14   18   11    6     4.00     .16   5.[...]
MC      0-1         39   .59    .72   -.24   -.15      11    9    9   10      .03     .05     .0
MC      2-8         31   .75    .58   -.08    .05       4   11   12    4      .03    3.06    2.40
MC      9-41        32   .88    .64   -.08    .01       7   10    8    7      .03     .07   [...]

Note.—Abbreviated: LC = lower class, MC = middle class, TC = teacher congruity,
TI = teacher incongruity, PC = pupil congruity, PI = pupil incongruity,
H₁ = Hypothesis 1, H₂ = Hypothesis 2, H₃ = Hypothesis 3.

ᵃ H₁: TC + TI > PC + PI; H₂: TC > PC; and H₃: TI > PI. Yates’ correction applied to chi
squares; chi square equals 2.71 at .05 level of significance, 3.84 at .02 level, 5.41 at .01 level, and 6.64
at .001 level, one-tailed with 1 df. Chi squares with p < .05 level are in italics.

In Table 1, significant results show teacher 
influence represented by T₁ causing pupils’
perceptions of teachers’ explaining ability
(P₅) to shift in congruent and incongruent
directions. For the total sample, teachers
cause pupils to shift significantly toward
incongruity. Across subsamples, there is
a definite pattern in the direction of
frequencies favoring teacher influence. With
LC pupils, teachers’ influence predominates
over pupils in both congruent and
incongruent directions. Results of analyses with
MC pupils indicate little one-sided
dominance by either teachers or pupils.

One of the study’s few instances in which
statistically significant results favor pupil
influence can be seen for the MC, 2-8 group
in Table 1 under H₂.

TABLE 2
Summary of Frequencies-of-Change-in-Product[...]

[Table body not recoverable.]


Subsamples by Pupils’ Social Class 

Table 2 presents results for the subsample 
with LC pupils and their teachers. The 
asterisks indicate the statistically significant 
chi-square results found. Analyses with 
T₀ and T₁ show strong teacher dominance
over LC pupils, especially in incongruent
influence. Such results contrast sharply
with results from similar analyses for MC
pupils, as shown in Table 3. Only in analyses
with MC pupils’ attitudes toward teachers’
innovative (P₈ and P₉) and motivational
(P₁₀) merit do predominant results in Table
3 favor teacher influence. Since innovative
and motivational pupil attitudes are based
on perceptions of teacher behavior most
likely to change over time (see earlier
description of these pupil factors), such
results are less significant than if comparable
results with other factors had been found.

The great contrast between results in
Tables 2 and 3 indicates considerable
differences in teacher-pupil interaction, which are
reflected in the contrasting attitude means 
where attitudes of the LC subsample are 
generally lower than those found for the 
MC subsample (Yee, 1968). 
TABLE 3

Summary of Frequencies-of-Change-in-Product-Moment Results for the Subsample
of 102 Teacher-Class Units with Middle-Class Pupils

[Table body not recoverable; it crossed pupil variables with teacher variables under Hypotheses 1, 2, and 3.]


Note.—Asterisks indicate significance level of chi-square results favoring teacher influence; no sig- 
nificant results found for analyses with Py , P:, P:, Ps, Ps, and P;. 


Subsamples by Pupil’s Social Class and 
Teacher's Experience 

Beginning teachers. LC pupils taught by 
novice teachers are influenced more than 
MC pupils. In T₀ and T₁ analyses for the
subsample, five significant chi squares
supporting H₁ were found, two for H₂,
and seven for H₃. In analyses with the
same attitudes, only one significant result 
was found for the relationships between 
novice teachers and MC pupils; and that 
was for H, with P,. No results favored LC 
or MC pupils’ influence. 

The causal dominance of teachers over 
LC pupils may be more understandable 
with closer examination of results for the 
subsample with LC pupils and beginning 
teachers. It was found that LC pupils’ 
disciplinary (P;) and motivational (Pio) 
attitudes shifted in the congruent direction 
(that is, became more positively correlated 
to teacher’s attitudes), but their affective 
(Pi and P2), cognitive (Py and Ps), and in- 
novative (Ps) attitudes shifted in the incon- 
gruent direction (that is, became less posi- 
tively correlated to teacher’s attitudes). 
Such results indicate that from pre- to 
posttest occasion, interaction became more 
teacher dominated, pupils became more 
conforming, and classroom climate grew 
colder. School became less appealing for
the LC students, who in acquiescing to the
negativity of their teacher’s attitudes as
indicated by pretest scores may have
contributed to the slight improvement in
their teacher’s attitudes as indicated by
posttest scores (e.g., T₁ = 4.68, T₁′ =
5.28).

Experienced teachers. Results for four 
subsamples with experienced teachers, that
is, teachers with 2-8 years’ experience and
those with 9+ years’ experience, also
indicated differences between teacher-pupil
interaction for LC and MC pupils. The
magnitude and extent of teacher dominance
over LC pupils in T₁ relationships become
greater as teachers’ experience increases.

In analyses with T₁ scores of teachers
with 2-8 years’ experience and LC
pupils’ scores, seven significant chi squares
supporting H₁ were found, five for H₂,
and six for H₃. For the counterpart MC
group, not one result favored teacher
influence. However, three results
barely significant at the .05 level favored
pupil influence.

The contrast between teacher-pupil
interaction for LC and MC pupils becomes even
greater with teachers of 9+ years’
experience. Nine of the 12 FCP results with
senior teachers’ T₁ measures and LC pupils’
measures supported H₁ and 9 supported
H₃. No significant results for H₂ or in
support of pupil influence were found. The
predominance of significant results for H₃
indicates considerable incongruent teacher- 
pupil interaction operating between senior 
teachers and LC pupils. Results for MC 
pupils and their senior teachers show only 
one significant chi square, and that is for 
Hi, with pupils’ Py attitude. 


DISCUSSION AND CONCLUSIONS


When a definite direction of influence in 
teacher-pupil interaction was found in 
FCP analyses, teachers influence their 
pupils much more in schools located in LC 
neighborhoods than those located in MC 
neighborhoods. In schools for LC pupils, 
teachers’ less positive attitudes of warmth, 
permissiveness, and favorability toward 
pupils tended to make pupils’ attitudes 
toward their teacher become more unfavor- 
able. In schools for MC pupils, the teachers’ 
more positive attitudes made less
difference, that is, had less effect on pupils’
attitudes.

The factor of teachers’ experience ac- 
counted for far less of the variance between 
teachers and classroom interaction than 
pupils’ social class. Between the two levels 
of pupils’ social class, attitude relationships 
of all three levels of teacher experience con- 
trasted greatly; but within the same level of 
pupils’ social class, levels of teacher
experience resembled each other. Teacher
dominance over LC pupils appears to become
more pronounced as teacher experience
increases. The more negative attitudes of
this study’s teachers with 9+ years’
experience working with LC pupils and their
incongruent attitude relationships with LC
pupils raise serious questions concerning
such teachers’ placement with LC pupils
and value as teachers in general.

Perhaps consideration of the pupils’ 
characteristics helps explain these results in 
part (Riessman, 1962). It may be argued
that LC pupils have less potent sources of
adult warmth and support at home and
hence depend more on, and are influenced
by, such adult influence at school. The more
vulnerable self-concept, or weaker ego, of
the LC pupil makes him more open to his
teacher’s influence as a determiner of his
attitude toward his teacher. The better
established orientation of the MC child 
toward adults in general, both parents and 
teachers, makes his attitudes toward his 
teacher more stable and less susceptible 
to the influence of the particular teacher he 
happens to have in any given year. 
Nevertheless, the more significant factors
determining such differences in teacher-pupil
interaction appear to lie in the area
of teachers’ characteristics rather than the
area of pupils’ characteristics. The relative 
differences between the pretest attitudes of 
MC and LC pupils’ teachers are greater 
than the differences between the pretest 
attitudes of MC and LC pupils. One ex- 
planation for such differences between 
teachers may be what administrators and 
supervisors emphasize in evaluating teacher 
performance in systems with MC pupils 
and in those with LC pupils. Turner (1965) 
suggested that the greater need to emphasize 
and help overcome pupils’ deficiencies in 
skill subjects as seen by administrators 
and supervisors in schools with working- 
class children creates a “criterion space” 
less concerned with teachers’ affective merit. 
Kliebard’s (1967) critical review of cur- 
riculum strategies for disadvantaged learners 
lends support to Turner’s argument. In 
schools serving MC families, pupils’
deficiencies in skills and intellectual
background are not major problems, so greater
emphasis is given the personal-social
characteristics of teachers.

Although much has been said and written
concerning the pre- and inservice
preparation [...] for work with disadvantaged
children (e.g., Riessman, 1967), such
preparation may be for naught unless school
administrators develop pedagogical and
employment policies that recognize the
affective needs of disadvantaged pupils as
well as their cognitive needs. The practical
significance of these findings and
interpretations is that the teacher’s attitudes
of warmth and permissiveness are even more
[...] to LC children than to MC
children. Zigler and Kanzer (1962) found a
significant interaction between the type of
reinforcer used and the social class of S.
The praise reinforcers, such as “good”
and “fine,” were more reinforcing than the
correct reinforcers, such as “correct” and
“right,” with LC children, while the correct 
reinforcers were more effective than the 
praise reinforcers with MC children. 

The hypothesis that LC pupils are dis- 
criminated against with respect to the 
teachers and classroom environments as- 
signed them appears to be supported by 
this study’s results. Insofar as such teacher 
attitudes can be brought into the class- 
room through selection and training pro- 
cedures, the effort should especially be made 
to place the “better” teachers in schools 
located in LC neighborhoods. 


“WHAT IS LEARNED” IN
MATHEMATICAL DISCOVERY


WILLIAM G. ROUGHEAD
North Georgia College

AND

JOSEPH M. SCANDURA
University of Pennsylvania


2 questions were asked: (a) can “what is learned” in [...]
discovery be identified and taught by exposition with [...]
results, (b) how does “what is learned” depend on prior learning and
on the nature of discovery? The major hypothesis was that discovery
Ss may discover derivation rules for deriving classes of solutions
but only when the solutions are not initially known. 4 programs,
(specific) rule-given (R), discovery (D), guided discovery (G), and
exposition of derivation rule (E), were administered to 7 groups. 1
group received program R alone; the others received R with 1 of the
other programs. Both orders of presentation were represented: RD,
DR; RG, GR; RE, ER. All Ss were required to derive new solutions
within the scope of the derivation rule. As hypothesized, Groups R
and RD performed at 1 level which was reliably (p < .001) below
the common level of the other 5 groups. Theoretical and practical
implications were discussed.


One of the fundamental assumptions 
underlying many of the new mathematics 
curricula is that discovery methods of 
teaching and learning increase the
students’ ability to learn new content (e.g.,
Beberman, 1958; Davis, 1960; Peak, 1963).
The last decade of research on discovery
learning, however, has produced only
partial and tentative support for this
contention. Even where the experiments have
been relatively free of methodological
defects, the results have often been
inconsistent (e.g., see Ausubel, 1961; Kersh &
Wittrock, 1962). More particularly, the
interpretation of research on discovery learning has
been made difficult by differences in
terminology, the tendency to compare identical
groups on a variety of dependent measures, 
and vagueness as to what is being taught 
and discovered. 

While most discrepancies due to differ- 
ences in terminology can be reconciled by a 
careful analysis of what was actually 
done in the experiments (e.g., Kersh &
Wittrock, 1962) and thus present a rela- 
tively minor problem, the failure to equate 
original learning has often made it difficult 
to interpret transfer (and retention) re-
sults in an unambiguous manner. Thus,
several studies (e.g., Craig, 1956; Wittrock,
1963) have shown that rule-given groups 
perform better on “near” transfer tests 
than do discovery groups. The obtained 
differences, however, may have been due to 
the fact that the discovery groups did not 
learn the originally presented materials as 
well as the rule-given groups. 

When the degree of original learning was 
equated, Gagné and Brown (1961) found 
that their discovery groups were better 
able to derive new formulas than were 
their rule- (i.e., formula) given groups.
They attributed this result to differences 
in “what was learned” but added that they 
were unable to specify precisely what these 
differences were. On the basis of an analy- 
sis of the experimental programs used by 
Gagné and Brown (1961), Eldredge (1965) 
hypothesized that the differences found by 
Gagné and Brown (1961) were due to un-
controlled factors. Eldredge conjectured
that if the treatment differences were lim- 
ited to the order of presentation of the 
discovery hints and the to-be-learned 
formulas, no differences in transfer ability 
would result. However, Eldredge’s results 
contradicted his hypothesis. In subsequent 
studies, Gutherie (1967) and Worthen 
(1967) obtained similar sequence effects. 

Using the Set-Function Language (SFL) 
characterization of a rule as a guide, 
Scandura (1966) proposed an analysis of 
discovery learning that seems to be in ac- 
cord with experimental findings. 


In the SFL, the rule is viewed as the basic unit 
of behavior; associations and concepts are shown 
to be special cases (of the rule). The denotation 
of a rule is defined as a set of functionally distinct 
stimulus-response pairs—the instances of the rule. 
The rule construct itself is characterized as an 
ordered triple (D, O, R) where D refers to the set
of those stimulus properties which determine the 
corresponding responses, and O refers to the opera- 
tion or transformation by which the derived stim- 
ulus properties or (internal) responses in the set R 
are derived from the properties in D (for more 
details, see Scandura, 1966, 1967a, 1968a, b, and c). 
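
The (D, O, R) characterization above can be pictured with a small data structure. This is only an illustrative sketch, not the SFL formalism itself: the class name `Rule`, its field names, and the choice of integers for stimulus and response properties are assumptions made here for the example, which uses the summing rule 2n² applied to the first four term numbers.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """A rule viewed as an ordered triple (D, O, R); names are hypothetical."""

    determining_properties: FrozenSet[int]  # D: stimulus properties that fix the response
    operation: Callable[[int], int]         # O: transformation applied to members of D

    @property
    def derived_properties(self) -> FrozenSet[int]:
        # R: the set of responses obtained by applying O to every member of D
        return frozenset(self.operation(d) for d in self.determining_properties)

    def denotation(self) -> Set[Tuple[int, int]]:
        # The instances of the rule: distinct stimulus-response pairs
        return {(d, self.operation(d)) for d in self.determining_properties}


# Example: the summing rule 2n^2 restricted to term numbers 1..4
summing_rule = Rule(frozenset({1, 2, 3, 4}), lambda n: 2 * n * n)
```

Calling `summing_rule.denotation()` lists the stimulus-response pairs that the text calls the instances of the rule.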


The main point of the analysis was that 
in order to succeed, discovery Ss must 
learn to derive solutions (i.e., responses) 
whereas solution-given Ss need not. In at- 
taining criterion, discovery Ss may discover 
a derivation rule by which solutions to 
new, though related, problems may be de- 
rived. Under these circumstances, discovery 
Ss would be expected to perform better 
than expository Ss on tasks which are 
within the scope of such a derivation rule. 
If the new problems presented have solu- 
tions beyond the scope of a discovered 
derivation rule, however, there would be no 
reason to expect discovery Ss to have any
special advantage. 

This study was concerned with two ma-
jor questions. First, can “what is learned”
in mathematical discovery be identified 
and, if so, can it be taught by exposition 
with equivalent results? Second, how does 
“what is learned” depend on prior learning 
and on the nature of the discovery treat- 
ment itself? 

The SFL was used as an aid in analyzing 
the guided discovery programs used by 
Gagné and Brown (1961) and Eldredge 
(1965) to determine “what was learned.”
As a result of this analysis, an expository 
statement of the derivation rule was de- 
vised. It was possible, in the manner de- 
scribed by Scandura, Woodward, and Lee 
(1967), to determine on an a priori basis 
which kinds of transfer item could be 
solved by using this derivation rule and 
which could not. 

Assuming that transfer depends only on 
whether or not the derivation rule is 
learned, then the order in which the for- 
mulas (i.e., the solutions) and the deriva- 
tion rule are presented should have no ef- 
fect on transfer so long as S actually 
learns the derivation rule. If, on the other 
hand, a discovery program simply provides 
an opportunity to discover and does not 
guide the learner through the derivation 
procedure, sequence of presentation might 
have a large effect on transfer. That is, if 
a capable and motivated S is given appro- 
priate hints, he might well succeed in dis- 
covering the appropriate formulas and in 
the process discover the derivation rule. It 
is not likely, however, that he would exert 
much effort when given an opportunity to 
discover a formula he already knows. 
Something analogous may well have been 
involved in the studies by Eldredge (1965), 
Gutherie (1967), and Worthen (1967). 

In particular, the following hypotheses 
were made. First, what was learned by
guided discovery in the Gagné and Brown
(1961) study can be presented by exposi-
tion with equivalent results. Second, pres-
entation order is critical when the hints
provided during discovery are specific to 
the respective formulas sought rather than 
relevant to a general strategy (i.e., deriva- 
tion rule). Third, presentation order is not 
critical when the program effectively 
forces S to learn the derivation rule, re-
gardless of whether the learning takes place
by exposition or by discovery. 


METHOD


Materials²


There were seven treatments. Each consisted of
a common introductory program followed by
various combinations of four basic instructional
programs. The introductory program was designed
to generally familiarize Ss with number sequences
and with terminology used in the four basic
programs. In particular, four concepts were clarified:
sequence; term value, $T_n$; term number, $n$; and
sum of the first $n$ terms of a sequence, $\Sigma^n$.

² Copies of the experimental materials are
included in Roughead’s (1966) dissertation and
Scandura’s (1967b) final report.

Each of the four basic instructional programs
was based on the same three arithmetic series and
their respective summing formulas:
$1 + 3 + 5 + \cdots + (2n - 1) \rightarrow n^2$;
$2 + 6 + 10 + \cdots + (4n - 2) \rightarrow 2n^2$;
$1 + 5 + 9 + \cdots + (4n - 3) \rightarrow (2n - 1)n$.
Following Gagné and Brown (1961),
each series was presented as a three-row display,
e.g.,

Term number $n$:     1    2    3    4    5
Term value $T_n$:    2    6   10   14
Sum $\Sigma^n$:      2    8   18   32


The rule and example (R) program consisted of
the three series displays together with the
respective summing formulas. The presentation of each
summing formula was followed by three
application problems, e.g., find the sum of $2 + 6 + 10$
($= 2 \cdot 3^2 = 18$). The S was also required to write
out each formula in both words and symbols, but
no rationale for the formula was provided. A test
of the three training formulas was included at the
end of the R program.
The other three basic programs included differ-
ing kinds of directions and/or hints as to how the 
summing formulas might be determined. The 
expository (E) and (highly) guided discovery (G) 
programs were based on a simplified variant of
that derivation rule presumably learned by the 
guided discovery Ss in the Gagné and Brown 
(1961) study. The identified rule can be stated, 


... formulas for $\Sigma^n$ may be written as the
product of an expression involving $n$ [i.e., $f(n)$]
and $n$ itself. The required expression in $n$ can
be obtained by constructing a three-columned
table showing: (1) the first few sums $\Sigma^n$, (2)
the corresponding values of $n$, and (3) a column
of numbers $f(n) = \Sigma^n / n$ which when multiplied
by $n$ yields the corresponding values of $\Sigma^n$.
Next, determine the expression $f(n) = \Sigma^n / n$ by
comparing the numbers in the columns labeled
$n$ and $\Sigma^n / n$ and uncovering the (linear)
relationship between them. The required formula is
simply $\Sigma^n = n \cdot f(n)$.


As an example, consider the display,


Term number $n$:     1    2    3    4    5
Term value $T_n$:    2    6   10   14
Sum $\Sigma^n$:      2    8   18   32

The three-columned table would look like,

$f(n)$    $n$    $\Sigma^n$
  2        1       2
  4        2       8
  6        3      18
  8        4      32
  $2n$     $n$    $2n^2$

The emerging pattern is $f(n) = 2n$, so that $\Sigma^n = n \cdot f(n) = 2n^2$.

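
The derivation rule stated above is mechanical enough to sketch in a few lines of Python. The function below is an illustration under stated assumptions, not the experimental material: the name `derive_sum_formula` and its return convention (coefficients `a` and `b` of a linear f(n) = a·n + b, or `None` when no linear relationship fits) are hypothetical choices made for this example.

```python
from fractions import Fraction


def derive_sum_formula(terms):
    """Apply the three-column derivation rule to the first few terms of a series.

    Returns (a, b) such that the sum of the first n terms equals n * (a*n + b),
    or None when the column f(n) = sum/n is not linear in n, which is the case
    for series beyond the scope of the rule.
    """
    if len(terms) < 3:
        raise ValueError("need at least three terms to check linearity")

    # Column 1: the first few sums
    sums, running = [], 0
    for term in terms:
        running += term
        sums.append(running)

    # Column 3: f(n) = sum / n for n = 1, 2, 3, ... (column 2 is n itself)
    f_values = [Fraction(s, 1) / n for n, s in enumerate(sums, start=1)]

    # Uncover the linear relationship between n and f(n), if there is one
    a = f_values[1] - f_values[0]
    b = f_values[0] - a
    for n, value in enumerate(f_values, start=1):
        if value != a * n + b:
            return None
    return a, b
```

For the display in the example, `derive_sum_formula([2, 6, 10, 14])` gives a = 2 and b = 0, i.e., f(n) = 2n and a sum of 2n². Exact fractions are used so that a series with fractional terms is not misjudged through rounding.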
The E program consisted of a simplified state- 
ment of the derivation rule as it applied to each 
of the three training series. To insure that S 
learned how to use the derivation rule, a vanishing 
procedure was used which ultimately required S 
to apply the procedure without any instructions. 
The G program paralleled the E program in all 
respects. The only difference was that the G pro- 
gram consisted of questions whereas the E program 
consisted of yoked direct statements, each fol- 
lowed by a parallel question or completion state- 
ment to see whether S had read the original
statement correctly. For example, the E statements,
“When $n = 3$, you can multiply 6 times $n$ to get
$\Sigma^n = 18$. What times $n$ gives $\Sigma^n = 18$?”
corresponded to the question, “When $n = 3$, what
times $n$ gives $\Sigma^n = 18$?” which appeared in the
G program. Since the degree of overt responding
was held constant, the only difference between the 
E and G programs was whether the information 
was acquired by reception or by reacting to a 
question (i.e., by discovery). The discovery (D) 
program, on the other hand, simply provided S 
with an opportunity to discover the respective 
summing formulas. The S was guided by questions 
and hints which were specific to the formulas 
involved (e.g., “the formula has a 2 in it”) rather
than relevant to any general strategy or
derivation rule. The questions and hints were
interspersed with liberal amounts of encouragement
(e.g., “Good try,” “you can do it,” etc.) to provide
motivation.
There were two transfer tests. The within-scope 
transfer test consisted of two new series displays 
which could be solved by the identified derivation 
rule. These series and their respective summing
formulas were $3 + 5 + 7 + \cdots + (2n + 1) \rightarrow (n + 2) \cdot n$
and $4 + 10 + 16 + \cdots + (6n - 2) \rightarrow (3n + 1)n$.
The extra-scope transfer test involved
the series $2 + 4 + 8 + \cdots + 2^n \rightarrow (2T_n - 2) = (T_{n+1} - 2)$
and $1/2 + 1/6 + 1/12 + \cdots + [1/n(n + 1)] \rightarrow n/(n + 1) = n^2 T_n$,
which, strictly speaking,
were beyond the scope of the identified derivation
rule. A series of hints, paralleling those used in the
D program, were constructed to accompany each
series.

The introductory and treatment programs were
mimeographed and stapled together into separate
5½ × 8½ inch booklets. The four transfer series
were presented on separate pages in a test booklet
in the same three-row form used in the learning
programs. The hints were put on 5 × 7 inch cards,
bound by metal rings.
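
The distinction between the within-scope and extra-scope series can be checked numerically. The sketch below is offered only as an illustration: the helper names (`partial_sums`, `formula_holds`, `quotient_is_linear`) are hypothetical, and the check simply asks whether the column sum/n is linear in n, which is what the identified derivation rule presupposes.

```python
from fractions import Fraction


def partial_sums(term, count):
    """Sums of the first 1..count terms of the series whose nth term is term(n)."""
    sums, running = [], Fraction(0)
    for n in range(1, count + 1):
        running += term(n)
        sums.append(running)
    return sums


def formula_holds(term, formula, count=10):
    """Check a stated summing formula against the first `count` partial sums."""
    return all(s == formula(n)
               for n, s in enumerate(partial_sums(term, count), start=1))


def quotient_is_linear(term, count=6):
    """True when f(n) = sum/n has constant first differences, i.e., is linear in n."""
    f = [s / n for n, s in enumerate(partial_sums(term, count), start=1)]
    steps = [later - earlier for earlier, later in zip(f, f[1:])]
    return all(step == steps[0] for step in steps)


# (nth term, stated summing formula) for each transfer series
WITHIN_SCOPE = [
    (lambda n: 2 * n + 1, lambda n: (n + 2) * n),
    (lambda n: 6 * n - 2, lambda n: (3 * n + 1) * n),
]
EXTRA_SCOPE = [
    (lambda n: 2 ** n, lambda n: 2 * 2 ** n - 2),
    (lambda n: Fraction(1, n * (n + 1)), lambda n: Fraction(n, n + 1)),
]
```

Under this check all four stated formulas agree with the partial sums, but only the two within-scope series yield a linear sum/n column.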


Subjects, Design, and Procedure

The Ss were 105 (103 females) junior and
senior elementary education majors, enrolled in
required mathematics education courses at the 
Florida State University, who volunteered to par- 
ticipate in the experiment. The data of seven other 
Ss were discarded because they failed to meet a 
major premise on which the hypotheses were based. 
That is, they were poorly motivated and/or made 
a large number of errors on the treatment pro-
grams.

The experimental Ss were randomly assigned to 
the seven treatment groups. In addition to the 
common introductory program, the R group re- 
ceived only the R program. The other six 
treatment groups received the R program together 
with one of the other three basic instructional pro- 
grams. The RE, RG, and RD groups received the 
R program followed by the E, G, and D programs, 
respectively, while the ER, GR, and DR groups 
received these same respective programs in the 
reverse order. 

The Ss were scheduled to come to the experi- 
mental room in groups of four or less and were 
arranged at the ends of two tables which were 
partitioned to provide separate study carrels. A 
brief quiz was used to screen out any Ss who were 
already familiar with number series and/or formu- 
las for summing them. Then, they were told: 


This is an experiment in learning mathematics. 
You will be given two programmed booklets to 
study. You are expected to try to learn. You
should work at a good pace, but read everything 
for understanding. ... If you have an error, don’t 
change your answer, but write the correct answer 
under your original answer. If you can not re- 
spond to a question within a minute or so, put 
an “X” in the blank and continue. You should, 
of course, look back at the question after finding 
the answer to be sure you understand... 


The Ss worked at their own rate. The two Es ob- 
served the progress closely, provided general as-
sistance and encouragement where needed, and
recorded the times taken on the introductory and 
treatment booklets. 

As soon as all of the Ss in the testing group had 
completed the treatment programs, they were told 
to review for a test. After 2 minutes, the booklets 
were collected and the tests and hint cards were 
presented. The Ss were instructed: 


On this test you will be timed. You also will 
be provided with hints to aid you when neces- 
sary. The less time it takes you and the fewer 
hints you need on a given problem, the better 
your score. You will be asked to find the formula
for four new problems on this test. On each prob- 
lem, you will have 5 minutes to find the correct 
summing formula. You should show any neces- 
sary work in your booklet. When you get an 
answer, raise your right hand immediately. Like 
this! Try it! ... I’ll tell you whether you are
correct or incorrect. If incorrect, continue search-
ing for the answer. Be sure to show me your
answer quickly so that you get the best possible 
time score.... When I tell you that the 5 
minutes are up, if you have not found the 
formula, you may begin using hints. You may 
use as many of the hints as you wish, and when 
you wish, after the 5 minute period. But remem- 
ber, the fewer hints you use, the better your 
score.


Before continuing on to the second problem, each
S read all of the hint cards pertaining to the first
problem. The four Ss in each testing group began
each problem at the same time. If an S solved a
problem before the others, he was allowed to read 
the rest of the hints for that problem and, then, 
was required to wait for the others to finish. Before 
being released, Ss were asked not to discuss par- 
ticulars of the experiment with others who might 
participate. 

Three indexes of performance on the transfer 
tasks were obtained: (a) time to solution, (b) 
number of hints prior to solution, and (c) a 
weighted score similar to that used by Gagné and 
Brown (1961). The weighted score was equal to 
the time to solution in minutes plus a penalty of 
4, 7, 9, or 10 depending on whether S used 1, 2, 
3, or 4 hints, respectively. Theoretically, a range of 
scores from 0 to 20 was possible on this measure. 
Standard analysis of variance procedures were used 
to analyze the data after Cochran’s C test failed to 
detect heterogeneity of variance. 
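
The weighted score described above can be written out as a short function. This is a sketch under stated assumptions: the name `weighted_score` is hypothetical, and a penalty of 0 for solving without hints is inferred from the stated 0 to 20 range rather than given explicitly in the text.

```python
# Penalties for 1, 2, 3, or 4 hints as stated in the text;
# the zero-hint entry is an assumption implied by the 0-20 range.
HINT_PENALTY = {0: 0, 1: 4, 2: 7, 3: 9, 4: 10}


def weighted_score(minutes_to_solution, hints_used):
    """Time to solution in minutes plus the penalty for the number of hints used."""
    if hints_used not in HINT_PENALTY:
        raise ValueError("hints_used must be between 0 and 4")
    if minutes_to_solution < 0:
        raise ValueError("time to solution cannot be negative")
    return minutes_to_solution + HINT_PENALTY[hints_used]
```

Lower scores indicate better transfer: for example, a solution in 5 minutes after two hints scores 5 + 7 = 12.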


RESULTS 


Treatment Programs 


All treatment groups performed at essen- 
tially the same level on the introductory 
program, both in terms of time to comple- 
tion (F = 1.74, df = 6/98, p > .05) and 
number of errors (F = 1.35, df = 6/98, 
p > .05). Since the number of frames var- 
ied among the treatment programs, no 
overall comparisons were warranted. 


Performance on Learning and Transfer 
Tests 


The results on the within-scope transfer 
test conformed to prediction. Irrespec- 
tive of the transfer measure used, the 
group (R) given the formula program only 
and the group (RD) given the formula 
program followed by the opportunity-to- 
discover program performed at one level 
(F < 1, df = 1/28) while the other five 
groups performed at a common (F < 1,
df = 4/70) and significantly higher level
($F_{time}$ = 32.66, df = 1/98, p < .001;
$F_{hints}$ = 54.52, df = 1/98, p < .001;
$F_{weighted}$ = 57.99, df = 1/98, p < .001).
In particular, only that sequence effect
involving Groups RD and DR was
significant (p < .01).

While there were no overall treatment
differences on the extra-scope transfer test
(maximum F = 1.31, df = 6/98, p > .05),
TABLE 1
SUMMARY OF MEANS AND STANDARD DEVIATIONS OF TIME AND ERRORS ON INTRODUCTORY AND
TREATMENT PROGRAMS; ITEMS CORRECT ON THE TEST OF CRITERION LEARNING; AND TIME,
HINTS, AND WEIGHTED SCORES ON WITHIN- AND EXTRA-SCOPE TRANSFER ITEMS

                                                        Condition
Program or test         R             RD            DR            RG            GR            RE            ER
                      X̄     SD      X̄     SD      X̄     SD      X̄     SD      X̄     SD      X̄     SD      X̄     SD

Introductory
  Time                —   2.20   9.00   2.71   7.13   1.50   7.73   2.88   8.73   2.93   7.97   1.00   8.18   2.00
  Errors           1.20   1.83    .80   1.04      —    .80    .60      —      —    .71    .47      —    .60   1.02
Treatment
  Time            23.07   7.42  29.83   8.70  33.00   7.70  36.40  10.29  41.47   8.20      —   8.70  40.20   8.60
  Errors           1.47   1.86   1.20   1.04   1.58   2.22   2.80      —   2.67   2.87   1.78   1.65   2.20   1.48
Learning test
  Items correct    5.53   1.09   5.47   1.06   5.47   1.15   5.67   1.01   5.53    .81   6.00   0.00   5.73    .68
Within-scope
  Time             6.48   1.87   5.98   1.96   4.36   1.78   4.00   1.37   4.21      —      —      —      —      —
  Hints               —      —      —      —      —      —      —      —      —      —      —      —      —      —
  Weighted        12.87   3.56  11.65   3.30   6.97   4.31   5.73   3.40   6.27   4.68   6.20   3.82   5.87      —
Extra-scope
  Time             6.82   1.69   5.91   1.48   5.51      —   5.70   1.20   5.51   2.07   5.88   1.95   5.76      —
  Hints            1.84   1.00   1.60    .88   1.20    .85   1.24      —   1.87      —   1.44    .73      —      —
  Weighted        12.22   4.09  11.01   3.77   9.60   3.83   9.63   3.12   9.71   4.87  10.14   2.93  10.08      —

Note. Abbreviated: R = rule given, D = discovery, G = guided discovery, E = exposition of derivation rule. A dash marks an entry that is illegible in this copy.


the contrast between Groups R and RD
and Groups DR, RG, GR, RE, and ER at-
tained a borderline significance level
($F_{time}$ = 3.66, df = 1/98, .05 < p < .10;
$F_{hints}$ = 4.02, df = 1/98, p < .05;
$F_{weighted}$ = 4.61, df = 1/98, p < .05).³ There were,
however, no reliable performance differ-
ences between Groups DR and RD (F <


³ In a study on rule generality, Scandura,
Woodward, and Lee (1967) obtained a similar
extra-scope transfer effect. While no extra-scope transfer
was almost universally the case, one of the rules
(i.e., 50 × 50) introduced was apparently
generalized (to n × n) and thereby provided an adequate
basis for solving an extra-scope problem. A recent
study by Scandura and Durnin (1968) has
demonstrated that, indeed, the form of a rule statement
is an important determiner of generalization.

While not sufficient as presented, the derivation
rule statement, introduced in this study, could
also be generalized. In particular, the first hint
available on the third transfer problem provided a
basis for making appropriate modifications.
Similarly, although Problem 4 involved fractional term
values, the summing formula could be obtained
by a relatively simple extension of the derivation
rule statement.

For these reasons and because the results on the
extra-scope test were subject to possible transfer
effects of testing on the within-scope test, caution
is advised in interpreting the extra-scope results.
The extra-scope test was originally included to
suggest experimental hypotheses and not definitive
conclusions. These comments, however, in no way
apply to the clear results on the within-scope test.


These transfer effects cannot be at- 
tributed to differences in original learning. 
A learning test embedded within the com- 
mon R program, indicated that Ss had 
well learned the appropriate summing 
formulas to the three training series be- 
fore they took the transfer tests. The 
group means ranged from 5.5 to 6 with a 
possible maximum of 6 and a minimum of
0. The error rates on the treatment pro- 
grams were similarly low with an aver- 
age of between one and two errors per 
program. 

DISCUSSION AND IMPLICATIONS


Two points need to be emphasized. 
First, “what is learned” during guided 
discovery can at least sometimes be identi- 
fied and taught by exposition—with equiv- 
alent results. While this conclusion may 
appear somewhat surprising at first glance, 
further reflection indicates that we have 
always known it to be at least partially 
true. As has been documented in the lab- 
oratory (e.g., Kersh, 1958) as well as by 
innumerable classroom teachers of mathe-
matics, it is equally as possible to give
Ss rules for deriving answers as it is to
have them derive (i.e., discover) the an-
swers themselves. No one to our knowl-
edge, however, had ever seriously con-
sidered identifying “what is learned” in
deriving rules (i.e., formulas) in addition
to the rules themselves. In the present 
study, the authors were apparently success- 
ful in identifying a (derivation) rule for 
deriving a class of more specific rules. No 
differences in the ability to derive new 
(within-scope) formulas could be detected 
between those Ss who discovered a deriva- 
tion rule and those who were explicitly 
given one. What was not done in this 
study was to consider the possibility that 
the discovery Ss may have acquired a 
still higher order ability—namely, an abil- 
ity to derive derivation rules. In any case, 
there are undoubtedly a large number of 
situations where, because of the complexity 
of the situation, “what is learned” by dis- 
covery may be difficult, if not impossible, 
to identify. In these situations, there may
be no real alternative to learning by dis- 
covery. 

Nonetheless, the value to transfer abil- 
ity of learning by discovery does not ap- 
pear to exceed the value of learning by 
some forms of exposition. Before definitive
predictions can be made, careful considera- 
tion must be given to “what is learned,” 
the nature of the transfer items, and the 
relationships between them. As we identify 
just what it is that is learned by dis- 
covery in a greater variety of situations, 
we shall be in an increasingly better posi- 
tion to impart that same knowledge by 
exposition. 

The second point to be emphasized con- 
cerns the sequence effect—if a person al- 
ready knows the desired responses, then he 
is not likely to discover another rule by 
which such responses may be derived, even 
if he has all of the prerequisites and is 
given an opportunity to do so. The re-
verse order of presentation may enhance
discovery without making it more difficult
to learn more specific rules at a later time. 
In short, prior knowledge may actually 
interfere in a very substantial way with 
later opportunities for discovery. Nonethe- 
less, there may be some advantages in- 
herent in learning more specific rules. Al- 
though data are practically nonexistent on 
this point, it is quite possible that specific 
rules may result in shorter latencies. 
Why and how sequence affects “what is
learned” is still open to speculation (e.g.,
Gutherie, 1967; Yonge, 1966). Our interpre- 
tation is as follows: When S is presented 
with one or more stimuli and is required 
to produce responses (e.g., formulas or 
specific rules) he does not already know, 
he necessarily must first turn his attention 
to deriving a rule (or derivation rule) by 
which he can generate the appropriate re- 
sponses. In the process, S may discover a 
derivation rule, which is adequate for de- 
riving other responses in addition to the 
ones needed. The kind and amount of 
guidance given would presumably help to 
determine the precise nature of the deriva- 
tion rule so acquired. On the other hand, if 
S already knows the responses (i.e., has
previously mastered more specific rules or 
“associations” by which the responses can 
be derived), it is not likely that he will 
waste much time trying to find another 
way to derive them. Under these conditions, 
it would seem that the only way to get S
to learn a more general rule would be to 
change the context. Presumably, the ex- 
pository and guided discovery Ss in this 
study learned the derivation rule because 
this appeared to be the desirable thing to 
do. The authors believe that any theory 
based on the rule construct will have to in- 
voke some such mechanism to account for 
sequence effects (e.g., see Scandura, 1968 
a, b, and c). 

The obtained sequencing result may also 
have important practical implications, as
will be attested to by any junior high 
school mathematics teacher who has at- 
tempted to teach the “meaning” under- 
lying the various computational algorithms 
after the children have already learned to 
compute. The children must effectively say
to themselves something like, “I already
know how to get the answer. Why should I
care why the procedure works?” This is
not to say that meaning should be taught
first simply out of some sort of dislike for
rote learning; for certain purposes rote
learning may be quite adequate and the
most efficient procedure to follow. The
important point is that learning such things
as how to multiply, without knowing what
multiplication means, may actually make
it more difficult to learn the underlying 
meaning later on. 
Conditions of massed and distributed practice were studied using a
within-Ss design in a situation involving computerized spelling drills.
In the distributed condition, 2 sets of 8 words each were presented
once every other day over a period of 6 days. The learning trials
on 6 other sets of words were massed so that all of the trials for that
set occurred on the same day. Ss were 29 5th graders. The probability
of a correct response for words in the massed condition was higher 
than that for the distributed condition during the learning sessions, 
but on retention tests (given 10 and 20 days later) the words 
learned under distributed practice were better remembered. A mathe- 
matical model of the learning process is presented and shown to 
provide a fairly adequate account of the experimental data. 


Computer-assisted instruction (CAI) re- 
fers to an instructional procedure which 
utilizes a computer to control part, or all, 
of the selection, sequencing, and evaluation 
of instructional materials. Over the last 4 
years, the Institute for Mathematical Stud- 
ies in the Social Sciences at Stanford Uni- 
versity has been developing a CAI system 
for regular classroom usage (Atkinson, 
1967). One mode of this development is re- 
ferred to by Suppes (1966) as the “drill 
and practice systems.” These systems are 
intended to supplement the instruction 
which occurs in the classroom. They are 
designed to improve—through practice 
—the skills and concepts which are intro- 
duced by the classroom teacher. 

Currently, computer controlled drills are 
being given to approximately 1,800 students 
in six schools in five different communities.
Some of the students have been receiving 
daily drills in arithmetic (Suppes, Jerman, 
& Groen, 1966) while others have been re- 
ceiving drills in spelling. This study made 
use of the equipment and students in the 
school which has been involved in drill and 
practice in spelling.

In the study to be reported here, the 
presentation routine for each spelling word 
was the same: An audio system presented 


¹ This research was supported by an Office of
Education contract, No. OE5-10-050, and by a
Health Service, Contract No. MH 06154.

² Now at the University of California, Irvine.


the words, the student typed the word,
and the computer evaluated the student's
answer. If the response was correct, the 
computer typed “...C...”; if incorrect, 
“...X...”, followed by the correct spelling 
of the word. If the response was not given 
within a predetermined length of time, the 
message “...TU...”, meaning “time is up,”
was printed. A flow chart summarizing 
this procedure is given in Figure 1. 
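
The evaluation step of this presentation routine can be sketched as a single function. The sketch is illustrative only: the function name `feedback`, its parameters, and the exact-match comparison are assumptions made here, and the 40-second default is taken from the Daily Operation description rather than from this paragraph.

```python
from typing import Optional


def feedback(target: str, response: Optional[str],
             elapsed_seconds: float, time_limit: float = 40.0) -> str:
    """Return the message printed after one presentation of a spelling word.

    `response` is None when the student typed nothing before the time limit.
    """
    if response is None or elapsed_seconds > time_limit:
        return "...TU..."                 # "time is up"
    if response == target:
        return "...C..."                  # correct
    return f"...X...   {target}"          # incorrect, followed by the correct spelling
```

A real drill program would also have to decide how to treat case and stray spaces; the exact comparison used here is simply the most conservative choice.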

These CAI drill and practice systems 
lend themselves nicely to the study of 
many experimental variables. One persist- 
ent problem in designing instructional 
systems is the specification of optimal pro- 
cedures for presenting material. Indeed, 
the spacing of learning sessions has already 
received considerable experimental investi-
gation, yet the question of optimal spacing
has not been resolved. For example, assume
that we have 6 days in which to teach a list
of 24 spelling words, and that each daily
session is arranged so that 24 presentations
can be made. What practice schedule
would produce the best results? One
might select a different set of four words
each day and on that day present each
word six times. At the other extreme, one
could present each of the 24 words once
per day. In both schemes a given word
would be presented for study on six
different occasions, but in one condition all of
the repetitions for a given word would
occur on 1 day whereas in the other case
they would be distributed over 6 days.
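
The two extreme schedules in this illustration (24 words, 6 days, 24 presentations per day) can be generated as follows. This is a sketch of the hypothetical example only, not of the experimental design reported later; the function names and the within-day ordering of presentations are assumptions.

```python
def massed_schedule(words, days=6, repetitions=6):
    """Each day presents a different block of words, each word `repetitions` times."""
    per_day = len(words) // days
    schedule = []
    for day in range(days):
        block = words[day * per_day:(day + 1) * per_day]
        # Within-day order (consecutive repeats) is an arbitrary choice for the sketch
        schedule.append([word for word in block for _ in range(repetitions)])
    return schedule


def distributed_schedule(words, days=6):
    """Every word is presented once on every day."""
    return [list(words) for _ in range(days)]


def days_on_which_presented(schedule, word):
    """Indices of the days on which `word` appears at least once."""
    return {day for day, session in enumerate(schedule) if word in session}
```

Both functions yield 24 presentations per day and six presentations per word in total; they differ only in whether a word's six occasions fall on one day or on six.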
MASSED VS. DISTRIBUTED PRACTICE IN COMPUTERIZED DRILLS


Fig. 1. Flow chart for presentation routine. (Legible box labels: system prepares to present first word; audio presentation of word; spelling; print, and correct spelling.)


The two extremes could be called, respec- 
tively, massed and distributed practice, 
although this terminology is somewhat at 
variance with the classical usage of these 
terms. The preponderance of experimental 
evidence indicates that, for the same 
amount of practice, learning is better when 
practice is distributed rather than massed,
although there are exceptions to the gen-
eralization. The purpose of the present
study is to investigate this problem further
and to evaluate optimum procedures for
distributing instructional material in com-
puter-based spelling drills.


METHOD

Subjects 


The Ss were 29 students from a fifth-grade class
in an East Palo Alto school. Approximately 50%
of the students scored below grade level on
standardized reading tests; 20% were reading at
the second and third grade level.


The Computer System and Terminals 


The computer which controlled the student
terminals was a modified PDP-1 digital computer
located at Stanford University. It was a
time-sharing computer capable of handling over 30
different users simultaneously from a variety of
input devices. The audio system for the spelling
drills was controlled by a Westinghouse P-50
computer which, in turn, was linked to the PDP-1.
Four student terminals were located at an
East Palo Alto school in a converted storeroom a 
short distance from the child’s classroom. Each 
terminal consisted of a standard teletype machine 
and a set of earphones; both were linked to the 
computer at Stanford by telephone lines. 

All four terminals were controlled by a single 
program on the PDP-1; each student user was 
serviced sequentially in a round-robin cycle. Due 
to the extremely rapid speed of the computer, the 
student received the impression that he was get-
ting “full-time” service, although actually the
computer was devoting only a small fraction of its 
running time to any one individual. 
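
The round-robin servicing described here can be illustrated with a toy scheduler. Nothing below reflects the actual PDP-1 program: the function name, the event queues, and the one-event-per-visit policy are all assumptions chosen to show how a single program can cycle through several terminals.

```python
from collections import deque


def service_round_robin(pending_events):
    """Visit terminals in a fixed cycle, servicing one pending event per visit.

    `pending_events` maps a terminal name to its list of waiting events.
    Returns the (terminal, event) pairs in the order they were serviced.
    """
    queues = {terminal: deque(events) for terminal, events in pending_events.items()}
    serviced = []
    while any(queues.values()):            # stop once every queue is empty
        for terminal, queue in queues.items():
            if queue:
                serviced.append((terminal, queue.popleft()))
    return serviced
```

Because each pass touches every terminal, no student waits longer than one full cycle, which is why a sufficiently fast machine appears to give each user continuous service.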


Daily Operation 


A full-time monitor was on duty whenever the 
children were using the teletypes. Her presence was 
primarily a precautionary measure so that an 
adult would be available in case of an emergency. 
The actual check-in, presentation and evaluation 
of the drill, and the sign-out were all handled by 
the CAI system and occurred as follows. 

The student entered the room, sat down at a
free terminal, and put on his earphones. The
machine printed out, “Please type your number.”
(This whole routine had been explained to the
students during a 2-week orientation session.)
After the student typed in his identification
number and depressed the space bar (the latter
operation was used as a termination signal for all
student responses), the computer printed the student’s
name and the program was set in operation. The
message, “If you hear the audio, please type an
‘9’ and a space,” was then heard over the
earphones. If the instructions were followed, the
lesson began and each word was presented
according to the sequence given in Figure 1.

The audio system presented a word, used the
word in a sentence, and then repeated the word
again. As soon as the audio was through, the
machine typed a dash (—). This was the student's
signal to begin his response. When he finished
typing his answer, he depressed the space bar, and
the computer evaluated the answer. A correct
response was followed by the typed message,
“...C...”. An incorrect response was indicated by
the message, “...X...,” followed by several spaces
and a correct spelling of the word. If a response
was not given in 40 seconds, the message,
“...TU...” was printed. As on an incorrect
answer, this message was followed by several spaces
and the correct spelling. After each response the
student was given 6 seconds to study
the correct answer before the next item was
presented. Each time a new item was presented,
the previous items were covered.
In the training sessions of this study, a “list”
consisted of 12 such presentations; in the test
sessions, 24 presentations. When the entire list had
been presented, the machine printed out the
following information for the student: his list
number for the next session, the date and ending time,
and the number of words he spelled correctly on
the day’s session. The drills were collected by the 
monitor and at no time was the student given a 
copy of the words to study on his own. 
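
The feedback rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the original PDP-1 program: the function name, the spacer width, and the case-insensitive comparison are assumptions, and the message strings are reproduced as well as they can be read from the printed text.

```python
# Hypothetical sketch of the per-item feedback rule (not the original program).
SPACER = "   "  # "several spaces" in the text; the exact width is an assumption


def feedback_line(target, response, elapsed_seconds, time_limit=40.0):
    """Return the line the teletype would print after one response.

    target          -- correct spelling of the word
    response        -- what the student typed, or None if nothing was typed
    elapsed_seconds -- time taken before the space bar was pressed
    """
    # No answer within the time limit: time-up message plus the correct spelling.
    if response is None or elapsed_seconds > time_limit:
        return "...TU..." + SPACER + target
    # Correct answer: confirmation only (comparison ignores case by assumption).
    if response.strip().lower() == target.lower():
        return "...C..."
    # Incorrect answer: error message plus the correct spelling.
    return "...X..." + SPACER + target


print(feedback_line("friend", "freind", 12.0))
```

In the procedure described, each such line was followed by a 6-second study interval before the next item.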


Words 


The words used in the experiment were taken 
from the New Iowa Spelling Scale (Greene, 1954). 
This scale is the product of the testing of some 
238,000 pupils throughout the country in the early 
1950s to determine the percentage of students that 
could spell a word correctly at each grade level. A
list of the actual words used in the experiment can
be found elsewhere (Fishman, 1967).


Experimental Design 


The experiment involved a within-Ss design
(i.e., each S participated in all conditions). The two
main conditions were those of massed (M) and
distributed (D) practice. There were eight sets of
words: six of them were massed, designated M1,
M2, M3, M4, M5, and M6; and two were
distributed, designated D1 and D2. Each of these
eight sets contained three words. Thus a total of
8 × 3 = 24 words were used in the experiment for
a given S. Training sessions ran for 6 consecutive
days. Each session used one of the M sets and one
of the D sets. The M words were presented three
times within a session, whereas the D words were
presented once. Thus, there were 3 × 3 = 9
presentations of M items plus 3 presentations of D
items, yielding a total of 12 presentations in any
one session. Words from a different M set were
presented in each session and all the learning trials
for the set occurred on the same day. Words from
a given D set were presented on alternating days.
Table 1 summarizes the daily presentations.

The arrangement of the list for the first
training session (Day 1) illustrates the procedure used
for the entire training sequence. The first four
items of the day’s list consisted of the three words
in M1 plus a randomly chosen word from D1. The
second four items consisted of the three M1 words
plus a second randomly chosen D1 word. The last
four items consisted of all three M1 words plus
the remaining word from D1. In other words, the
12 presentations to an S on any day were given in
three blocks with four words in a block. Each block
contained all three M words and a randomly
chosen D word. The order of the words within a
block was randomly determined. Further, the
assignment of words to M and D sets was completely
counterbalanced over Ss, so that every word
appeared equally often in the various M and D
conditions.
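
The block structure of the six daily training lists can be made concrete with a short sketch. The function name and the placeholder word labels below are hypothetical; the counterbalancing of words over Ss is not modeled, and the randomization uses Python's standard generator rather than whatever procedure the original program used.

```python
import random


def build_training_lists(m_sets, d_sets, rng):
    """Build one 12-item list per training day.

    m_sets -- six lists of three massed words (M1..M6), one set per day
    d_sets -- two lists of three distributed words (D1, D2), alternating by day
    rng    -- a random.Random instance used for all shuffling
    """
    days = []
    for day, m_words in enumerate(m_sets):
        # D1 is used on Days 1, 3, 5 and D2 on Days 2, 4, 6 (Table 1).
        d_words = list(d_sets[day % 2])
        rng.shuffle(d_words)  # which D word falls in which block is random
        day_list = []
        for d_word in d_words:
            # Each block holds all three M words plus one D word, in random order.
            block = list(m_words) + [d_word]
            rng.shuffle(block)
            day_list.extend(block)
        days.append(day_list)
    return days


# Placeholder labels stand in for the actual spelling words.
m_sets = [[f"m{i}{c}" for c in "abc"] for i in range(1, 7)]
d_sets = [[f"d{i}{c}" for c in "abc"] for i in range(1, 3)]
lists = build_training_lists(m_sets, d_sets, random.Random(0))
print(lists[0])
```

Each day's list therefore contains every M word of that day three times and every word of the day's D set once, for 12 presentations in all.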

Tests were administered 10 and 20 days after 
the end of the training sequence. The students did
not receive any computerized drill between the 
training and test days. The basic test procedure 
consisted of presenting the complete list of 24 
words. The order of the words for each S was ran- 
domly determined, and each word was presented 
once using the procedure of Figure 1. As during 
TABLE 1

Summary of the Word Sets Used during the
Six Training Sessions

Condition       | 1  | 2  | 3  | 4  | 5  | 6
Massed (M)      | M1 | M2 | M3 | M4 | M5 | M6
Distributed (D) | D1 | D2 | D1 | D2 | D1 | D2


the training sessions, the student was told whether 
or not his response was correct, and was then given 
6 seconds to study the correct answer before the 
next item was presented. 


RESULTS 


Figure 2 presents the proportion of correct
responses over successive presentations
of M and D items. For example, on Day
1, the M1 items were each presented three
times; the proportions correct for each of
the three presentations were averaged over
Ss and plotted successively above Training
Session 1. The D1 items were each presented
once; the mean proportion correct
for these items is also plotted above Training
Session 1. This was done for the data
from each of the six training sessions.
Approximately 2 minutes elapsed between
two presentations of a massed item,
whereas 2 days elapsed between any two
presentations of a distributed item.

The tests were given on Days 16 and
26. The test results are also presented in
Figure 2. The six massed curves are similar
in form; they all rise sharply, then
drop off by the time of the administration
of the first test. In contrast, the two
distributed curves rise more gradually but do
not show a drop off at the time of the first
test.

All items were presented three times during
the training sequence and once on each
of the test days. Figure 3 gives the proportion
correct on each presentation averaged
separately over M and D items. During the
training sequence, the proportion correct
for the M items increased from about [.?4]
on the first presentation to .77 on the third
presentation, whereas the D items
correspondingly increased from about .25 to
.57. The difference between the mean
proportion correct on the first presentation
of M items and the first presentation of D
Fig. 2. Proportion of correct responses for massed and distributed items on both training
and test trials. (Horizontal axis: experimental sessions, divided into training sessions and test sessions.)


items was not significant at the .05 level
using a paired t test, t = 1.58, df = 28.
However, there is no reason to expect
equality when it is noted that the data point
for the mean of the massed first presentations
came from all six training sessions
whereas the data point for the mean of the
distributed first presentations came from
the first two training sessions. In contrast,
as indicated in Figure 3, there were
significantly more correct responses on the
second and third presentations of the M
items than on the corresponding
presentations of D items.


Fig. 3. Observed and predicted values for the
massed and distributed conditions. (Vertical axis: proportion correct; horizontal axis: item presentations, divided into training sessions and test sessions.)


A paired t test on the combined data from
the posttraining tests yielded t = 2.44,
df = 28, which was significant at the .025
level, indicating that distributed practice
resulted in better performance than massed
practice.


Discussion 


The major results of this experiment were: 
(a) the massed condition was superior to the 
distributed condition on the second and 
third presentations of the training sequence 
and (b) the distributed condition was superior
on both of the test sessions. Thus, it
appears that the massed repetitions are 
better if one looks at short-term perform- 
ance, but in the long run more learning 
occurs when repetitions of an item are well 
distributed. 

In this section, these data are analyzed
in terms of a model that has been proposed 
to account for paired-associate learning. 
The model is a variation of the trial-de- 
pendent-forgetting model presented in recent 
articles by Atkinson and Crothers (1964) 
and Calfee and Atkinson (1965). The learn- 
ing of a list of spelling words can be said 
to resemble the learning of a list of paired-associate
items; no assumption is made that
the two tasks are identical, yet there are 
variables in paired-associate learning that 
clearly are relevant to the spelling task. 

In the model, S is assumed to be in one
of three learning states with respect to a
stimulus item: (a) state U is an unlearned
state, in which S responds at random from
the set of response alternatives, (b) state S 
is a short-term memory state, and (c) state 
L is a long-term state. The S will always 
give a correct response to an item if it is in 
either state S or state L. However, it is 
possible for an item in state S to be for- 
gotten, that is, to return to state U, whereas 
once an item moves to state L it is learned 
in the sense that it will remain in state L for 
the remainder of the experiment. In this 
model, forgetting involves a return from 
the short-term memory state, S, to state 
U, and the probability of this return is 
postulated to be a function of the time 
interval between successive presentations 
of an item. 

More specifically, two types of events are 
assumed to produce transitions from one
state to another: (a) the occurrence of a 
reinforcement, that is, the paired presenta- 
tion of the stimulus item together with the 
correct response, and (b) the occurrence of a 
time interval between successive presenta- 
tions of a particular item. The associative 
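effect of a reinforcement is described by the
following transition matrix (rows give the state before the reinforcement, columns the state after it):

\[
\begin{array}{c|ccc}
  & L  & S      & U   \\ \hline
L & 1  & 0      & 0   \\
S & a  & 1-a    & 0   \\
U & bx & (1-b)x & 1-x
\end{array}
\]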

Thus, if an item is in state U and the correct
response is shown to S, then with probability
(1 − x) the item stays in state U, and with
probability x the item moves into state S
or L. If it moves, then with probability b
it moves into L and with probability (1 − b)
into S. Similarly, if an item is in state S
and the correct response is shown, then with
probability a the item moves to state L,
and with probability 1 − a the item stays
in state S. Finally, if an item is in state L
then it remains there with probability 1.
The parameter x is assumed to vary as a
function of the familiarity of the items in
the list being studied. Thus, during the test
sessions involving 24 familiar items, x will
be larger than during the initial sessions
involving 12 items, many of which are
presented for the first time.

From one presentation of an item to its
next presentation, a transition can occur
as described by the following matrix:

\[
\begin{array}{c|ccc}
  & L & S     & U   \\ \hline
L & 1 & 0     & 0   \\
S & 0 & 1-f_t & f_t \\
U & 0 & 0     & 1
\end{array}
\]

The parameter f_t depends on the time
interval between successive presentations
of the same item. If a given item is in state
S, a time interval t between successive
presentations may result in forgetting of
the item (i.e., transition to state U) with
probability f_t. Otherwise there is no change
in state. For simplicity, we assume f_t = 0
for short time intervals within the range of a
given training session. When the time interval
is a day or greater, then we assume
f_t = 1. In essence, no forgetting occurs from
the short-term state within a given training
session, but from one day to the next no
information is retained in short-term store.
Furthermore, the above transition matrices
imply that L is an absorbing state; once an
item enters state L it remains there. The
model makes the additional assumption
that at the start of the experiment an item
is already known (state L) with probability
p, or not known (state U) with probability
1 − p.
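
The two transition rules can be combined into a short numerical sketch that yields the expected proportion correct on each presentation. The function name and the schedule format are hypothetical, and two points are assumptions rather than statements from the text: the probability of a correct response in state U is taken to be zero (a chance guess at a spelling is treated as negligible), and forgetting is applied just before each presentation. The parameter values in the usage lines are the estimates reported for the first experiment (p = .28, a = 0, b = .38, x = .45 in training and .74 on tests).

```python
def predicted_correct(p, a, b, schedule, guess=0.0):
    """Expected proportion correct on each presentation of an item.

    p        -- probability the item starts in state L (otherwise state U)
    a, b     -- reinforcement parameters from the transition matrix
    schedule -- list of (f, x) pairs, one per presentation: f is the forgetting
                probability over the interval preceding the presentation,
                x the learning parameter in force at that presentation
    guess    -- probability of a correct response in state U (assumed 0)
    """
    L, S, U = p, 0.0, 1.0 - p
    proportions = []
    for f, x in schedule:
        # Interval before the presentation: S -> U with probability f.
        U += f * S
        S *= 1.0 - f
        # Response: correct whenever the item is in S or L.
        proportions.append(L + S + guess * U)
        # Reinforcement: apply the L/S/U transition matrix.
        L, S, U = (L + a * S + b * x * U,
                   (1.0 - a) * S + (1.0 - b) * x * U,
                   (1.0 - x) * U)
    return proportions


# Three training presentations followed by two tests.
massed_schedule = [(0, .45), (0, .45), (0, .45), (1, .74), (1, .74)]
distributed_schedule = [(0, .45), (1, .45), (1, .45), (1, .74), (1, .74)]
massed = predicted_correct(.28, 0.0, .38, massed_schedule)
distributed = predicted_correct(.28, 0.0, .38, distributed_schedule)
print([round(v, 3) for v in massed])
print([round(v, 3) for v in distributed])
```

Under these assumptions the second massed presentation equals p + (1 − p)x, because every item that left state U is still in S or L, whereas the second distributed presentation reflects only the items that reached state L.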

For this model, the difference between
the M and D items on the second and third
presentations is due to a difference in the
probability that an item is in short-term
memory (state S). The parameter a
characterizes the probability of going from state
S to state L. This parameter can operate
only for the massed items, since it is
impossible for a distributed item to be in state
S when a reinforcement occurs. A distributed
item could go into state S immediately after
its presentation, but from one presentation
to its next, it would have been forgotten.
The probability of being correct on an item
that is in state S is one; thus the massed
curves should be higher for the second and
third presentations.

The assumption that f_t = 1 when the
time interval is a day or longer means that
short-term memory has been wiped out
completely by the time the first test is given.
Thus, superiority of the D items over the M
items in the test data indicates differences
in the number of items in state L. This in
turn implies that the parameter b must be
larger than the parameter a. If b were smaller
than a, one would expect the M condition
to do better than the D condition during
both the training and test sessions, whereas
if b were equal to a, one would expect a
difference during the training sessions in
favor of the M condition, but none in the
test sessions.

Parameter estimates for the model were 
obtained by methods described in Atkinson 
and Crothers (1964). The values which 
yielded the best fit between observed and 
predicted proportions were: 


p = .28
a = 0
b = .38
x (for training sessions) = .45
x (for test sessions) = .74


These estimates were consistent with the 
notion that b should be larger than a. The
model proposed here is similar to Greeno’s 
(1964) model for paired-associate learning 
in which he explicitly requires the parameter 
a to be zero. The present findings for this 
more complex task indicate that his theory 
and related research on paired-associate 
learning are relevant to the effect of repeated 
presentations of spelling items. Figure 3 
presents the fit between the observed and 
predicted proportions using the above 
parameter estimates. Inspection of this 
figure indicates that the model gave an 
adequate account of the results of the experiment.

To check the validity of these results, the 
same Ss were run 2 weeks later using precisely
the same procedure but with a new
set of words. Figure 4 presents learning 
curves for this replication comparable to 
those presented in Figure 3. Application 
of the model to these data yielded the following
set of parameter estimates:

p = .32
a = 0
b = .33
x (for training sessions) = .60
x (for test sessions) = .72
Fig. 4. Observed and predicted values for the
replication experiment. (Horizontal axis: item presentations, divided into training sessions and test sessions.)


Once again, the estimate of a is zero con- 
firming our earlier result. Also, in general, 
performance is superior in the second experi- 
ment, suggesting that some form of learning- 
to-learn may be operating in this situation. 
The authors have not carried out analyses 
that bear on some of the more detailed 
features of the model. In fact, in view of the 
stimulus material used, it seems unlikely 
that these features would be verified. What 
clearly needs to be done is to generalize the 
paired-associate model to take account of 
the linguistic constraints imposed by the
spelling task. Some of the present results
and those of Knutson (1967) suggest guide- 
lines for such a model but the authors are 
not prepared to be more specific at this 
time. Hopefully such a model would provide 
a more definitive answer to the problem of 
optimizing the instructional sequence in 
spelling drills. 
EFFECTIVENESS OF FEEDBACK TO TEACHERS AS A
FUNCTION OF SOURCE¹
Rutgers - The State University


286 teachers were separated by years of teaching experience and
subjected to 1 of 4 conditions: (a) feedback from students only, (b)
from supervisors, i.e., vice-principals only, (c) from both students and
supervisors, and (d) from neither (no feedback). It was found that 
student feedback led to a positive change among teachers (as 
measured by change in students’ ratings across a 12 wk. interval). 
Supervisor feedback added nothing to this effect when combined with 
student feedback, and when alone, produced change in a direction 
opposite to the feedback as compared to the no-feedback condition.
Less experienced teachers showed greater receptivity to student 
feedback than their more experienced counterparts while the reverse 
held true for receptivity to supervisor feedback. 


The problem of modifying the behavior 
of teachers is one that has been submitted 
to close scrutiny from a variety of vantage 
points. Techniques such as microteaching 
and the use of interaction process analysis 
have been employed, primarily with stu- 
dent teachers, as a means of altering their 
behavior. Underscoring the entire rationale 
for this approach, Daw and Gage (1967) 
recently said: 


It is highly plausible that feedback regarding how
others feel about one’s behavior will affect one’s
behavior. Whether this maxim will hold under a
given set of practical circumstances must, however,
be determined empirically [p. 181].


This study was an attempt to extend
this “maxim” to conditions as yet
untested.

Bryan (1968) has shown that teachers
will alter their behavior as the outcome of
receiving feedback from their students.
The purpose of this study was to replicate
Bryan’s basic finding, using his instrument,
and then to extend this finding by
determining the relative effects of feedback
from students and from supervisors (i.e.,
administrators responsible for instruction)
on teachers’ behavior. Moreover, Bryan's
study did not include control over the
variable of amount of teaching experience of
teachers whose behavior was to be changed.
His experimental and control groups
¹ This study was, in part, the doctoral dissertation
of the junior author. It was supported by
Grant No. 6-8327 from the United States
Office of Education.


showed an imbalance on this variable at 
the conclusion of his experiment with the 
preponderance of less experienced teachers 
appearing in the experimental group. An 
additional purpose of the present study was 
to systematically introduce years of teach- 
ing experience as an experimental variable 
so that its effects, if any, could be deter- 
mined. 

Finally, the present study was carried 
out with vocational teachers, in order to 
demonstrate additional generalizability for 
the basic finding obtained by Bryan using 
primarily teachers of academic subjects. 

The fact that teachers change as the re- 
sult of student feedback has also been 
demonstrated by Gage, Runkel, and Chat- 
terjee (1960). Their study also showed 
that amount of change was related to the 
interval between pretest and posttest. Daw 
and Gage (1967) have shown, furthermore, 
that feedback from teachers can be used to 
alter the behavior of principals, but that 
the amount of change is not a function of 
the pretest-posttest interval.

In this study, as in previous studies in 
this area, the measurement of change in 
teacher behavior was inferential. Students 
were asked to rate their teacher twice, 
with a 12-week interval separating these 
ratings (during which time the treatments 
could take effect). Behavior change by 
teachers was inferred from a difference be- 
tween postinterval and preinterval ratings.
Remmers (1963) has shown that students,
as a measuring instrument, are as
reliable as the best mental and educational
paper-and-pencil tests and can discrim- 
inate between aspects of teacher behavior 
(see also Tuckman, 1967). Thus, the de- 
pendent variable was identified as change 
in teachers’ behavior with the recognition 
that this was inferential. 

The expectation that years of teaching 
experience would be a significant variable 
was based on studies such as those of
Ryans (1964) and Peterson (1964), who
have shown that teachers’ behavioral pat- 
terns change in a systematic fashion as a 
function of age. While age and years of 
teaching experience are not the same vari- 
able, they are assuredly related, with the 
latter being perhaps the more conceptually 
meaningful in an educational context. 


PROBLEM 


To determine the relative effects of 
students and supervisors as feedback 
sources for teachers, four conditions were 
run. In the first condition student feedback 
alone was employed; in the second, super- 
visor feedback was employed alone (the 
supervisor being an administrator, usually 
a principal or vice-principal responsible for 
the teaching activities of teachers); in the
third, both feedback sources were employed 
concomitantly; and in the fourth, no feed- 
back was given. Teachers were further 
classified as to teaching experience and 
systematically assigned to conditions on 
that basis. 

It was hypothesized that: (a) teachers 
receiving feedback would change more than 
teachers not receiving feedback (essentially 
a replication of Bryan’s results); (b) 
amount of change in teachers’ behavior 
would vary as a function of feedback 
source; (c) years of teaching experience 
and amount of change would be inversely 
related.


METHOD


Sample 


The sample consisted of 286 teachers of voca- 
tional subjects at the high school or technical in- 
stitute level. Schools were selected from New 
Jersey and surrounding out-of-state counties and 
virtually all the vocational teachers in the schools 
used took part in the study. Participating teach- 
ers had a median class size of 15 students who were
either in the tenth, eleventh, or thirteenth grade.


Measurement of Teacher Behavior 


Teacher behavior was measured by the
Student-Opinion Questionnaire (SOQ) developed by
Bryan (1963). This instrument includes 10 rating
scales on which the teacher is judged as to his (a)
knowledge of his subject, (b) ability to explain,
(c) fairness, (d) ability to maintain discipline,
(e) degree of sympathetic understanding, (f) ability
to make you learn, (g) ability to be interesting,
(h) ability to get things done efficiently, (i) ability
to get students to think for themselves, and (j)
general all-round teaching ability. Each scale has
five points labeled: below average, average, good,
very good, and the very best.

Bryan (1963) has reported reliability coefficients
for the 10 items on the SOQ of from .75 to .85 for
chance-half averages for 50 classes. For whole
classes of 28 students on the average, coefficients
of from .86 to .92 were obtained.

On the reverse side of the SOQ are four open- 
ended questions dealing with the course and 
teacher, reflecting on things that are liked about 
each and suggestions for the improvement of each. 


Feedback Conditions 


Students only. Students completed the SOQ,
and their ratings on the 10 scales were averaged.
The teacher was presented with a graph showing
the average student judgment for each item. In
addition, a summation of the students’ responses
on the open-ended questions was provided.
Teachers were told that the feedback was from
their students.

Supervisor only. The teacher's supervisor
(either the principal, vice-principal, or assistant
principal) completed the SOQ, and his ratings on
each item were given to the teacher in graphic
form along with a summary of his answers to the
open-ended questions. The teacher was told that
this rating was made by the supervisor. (In this
condition, student ratings were also obtained
although these were not made available to the
teacher.)

Students and supervisor. The teacher's supervisor
and students completed the SOQ, and feedback
from each was given separately, along with
identification of source in the same manner as in
the first two conditions.

No feedback. Students completed the SOQ,
but no feedback was provided to the teacher.

All initial testing was done in the late fall. 


Years of Teaching Experience


Based on information from a personal information
form, teachers were categorized as having
1-3 years of teaching experience, 4-10 years of
teaching experience, or 11 or more years of
teaching experience. Teachers from each group were
then randomly assigned to each condition. The



TABLE 1
Design of the Experiment: Assignment of
Teachers to Treatment and
Experience Groups

                         | Years of experience of instructor
                         | 1-3 years | 4-10 years | 11 or more years
Condition                | B1 | B2   | B1 | B2    | B1 | B2
No student feedback (C1) | 14 | 18   | 19 | 18    | 18 | 18
Student feedback (C2)    | 39 | 32   | 25 | 31    | 32 | 27

Note. Cell entries are number of observations
per cell; N = 286. Abbreviated: B1 = no supervisory
feedback, B2 = supervisory feedback.


overall design of the study and assignment of 
teachers to conditions is shown in Table 1. 


Measurement of Change in Teacher’s Behavior


In the late spring, following a 12-week interval
after the initial testing, students of each of the
teachers in the study completed the SOQ. The
measure of change in each condition was the sum
of the differences between the preinterval judgments
by the students on the 10 items and their
postinterval judgments. Ratings on each item
were averaged across students and the preinterval
average on each item was then subtracted
from the postinterval average to yield a change
score on each of the 10 items. These 10 item
change scores were summed to obtain a total
change score. Student judgments were used
throughout as a measure of change to maintain
a constant measuring instrument across conditions.
This was seen as justifiable since preinterval
ratings by students did not differ significantly
from those of supervisors in conditions where
both were obtained and the latter were used as
the feedback source.
All test administration was accomplished by
a local vocational guidance counselor.
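
The total change score lends itself to a brief worked sketch. The function name and the data layout are hypothetical, and the coding of the five scale points as 1 through 5 is an assumption; the source names the five labels but does not state the numerical values attached to them.

```python
def total_change_score(pre_ratings, post_ratings):
    """Total change score for one teacher.

    pre_ratings, post_ratings -- one list per student, each holding that
    student's ratings on the 10 SOQ items (coded 1-5 by assumption).
    The two administrations need not have the same number of students.
    """
    n_items = len(pre_ratings[0])

    def item_means(ratings):
        # Average each item across the students in the class.
        return [sum(student[i] for student in ratings) / len(ratings)
                for i in range(n_items)]

    pre_means = item_means(pre_ratings)
    post_means = item_means(post_ratings)
    # Postinterval average minus preinterval average, summed over items.
    return sum(post - pre for pre, post in zip(pre_means, post_means))


# Illustrative (made-up) class of two students rating all 10 items.
pre = [[3] * 10, [5] * 10]    # item means of 4 before the interval
post = [[3] * 10, [3] * 10]   # item means of 3 after the interval
print(total_change_score(pre, post))
```

A negative total, as in this made-up case, means the class rated the teacher less favorably after the interval than before it.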


Analysis 


For purposes of analysis, the four feedback
conditions (Conditions 1-4) were treated as two
factors, supervisor feedback and student feedback,
with two levels on each: present and absent.
The four conditions were thus labeled as
follows: (b2c2) student and supervisor feedback,
(b2c1) supervisor feedback only, (b1c2) student
feedback only, and (b1c1) no feedback (see Table
1). Years of teaching experience was the first factor
and had three levels. Subsequently, a 3 × 2 × 2
analysis of variance using the unweighted means
solution for unequal cell entries (Winer, 1962)
was carried out on the total change score for each
teacher. (Each teacher was used only once in the
design.) In addition, direct mean comparisons
were made using the Duncan multiple-range test
(Duncan, 1955).²


RESULTS 


The results of the analysis of variance 
for the total change score showed that the 
presence of student feedback (Factor C) 
had a significant effect on teachers’ be- 
havior as compared to its absence (F = 
5.941; df = 1/274, p < .025) while the 
presence of supervisor feedback (Factor 
B) produced no significant effect (F = 
1.064; df = 1/274). The years-of-experi- 
ence variable (Factor A) also failed to 
produce a significant effect (F = 0.701; 
df = 2/274) and none of the interactions 
achieved significance at the .05 level (F < 
1 in each case). 

In an effort to delineate further the 
feedback effects, means for the four feed- 
back conditions were compared, as shown 
in Table 2. From the table it can be seen 
that both conditions involving student 
feedback showed significantly greater 
change than both conditions not involving 
student feedback.³ Feedback from students


TABLE 2
Mean Total Change Scores by Feedback
Condition and Their Comparison by Duncan
Multiple-Range Test

Students only | Students and supervisor | Supervisor only | No feedback
-.054         | -.385                   | -2.449*         | -1.234*

* Significantly different from all other means
(with exception of difference between
second and fourth means, where p < .05).


² A fifth condition, called the posttest-only
control group by Daw and Gage (1967), was also
run with an additional 15 teachers who were rated by
their students only at the end of the interval.
The purpose of this condition was to determine
whether the pretest or preinterval rating
had a sensitizing effect on the raters.
A comparison of the mean for this group
to the mean on the postinterval rating
for the no-feedback group showed them
to be comparable. Thus,
test sensitization was not a source of invalidity.

³ Throughout this description, results are
referred to as changing “more” or
“less.” However,
alone and from students and supervisors
combined were statistically comparable,
indicating a failure for feedback from
supervisors to generate any change beyond
that accounted for by student feedback
alone. Finally, feedback from supervisors
alone produced a significantly greater negative
shift (i.e., a change in the opposite
direction of that recommended by the
feedback) than no feedback at all.

Thus, student feedback “improved”
teacher behavior as compared to no feedback.
Supervisor feedback produced no additional
effect when combined with student
feedback, and an adverse effect when used
alone.


Discussion 


The first hypothesis of this study pre- 
dicted that feedback (source unspecified) 
would yield a greater positive change than 
no feedback, while the second hypothesis 
predicted different effects for the different 
feedback sources. The surprising finding of 
this study was that teachers receiving 
feedback from supervisors changed more in 
the opposite direction from the feedback 
than the spontaneous shift obtained in the 
no-feedback condition. Thus, the first hy- 
pothesis holds true for student feedback (a 
replication of Bryan’s findings) which led 
to effects in excess of the no-feedback 
condition. Supervisory feedback added 
nothing to the student feedback effect 
when they were combined. (If anything it 
reduced it, but not significantly so.) Since 
supervisory feedback had the opposite ef- 
fect than predicted, the second hypothesis 


in the light of the fact that almost all of the
means are negative, changing more means showing
a lesser negative shift (i.e., a smaller negative
change score) while changing less means showing
a greater negative shift (i.e., a larger negative
change score). This tendency for ratings to be
less positive following the interval as compared
to those preceding the interval was not attributable
to a testing effect (see the preceding footnote).
One must conclude that students as raters are
more negatively inclined toward their teachers in
the spring (after experiencing them for a year)
than in the fall. Thus, the positive effect of feedback,
when it occurred, was to reduce this tendency
toward greater negativity of ratings (i.e.,
make the negative score smaller or positive).
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was confirmed—that is, the feedback 
sources did have different effects. If in the 
first hypothesis, it was simply predicted 
that feedback would produce greater 
changes than no feedback, it would have 
been confirmed. Certainly this experiment | 
suggests that teachers react to feedback, 
irrespective of source, with these reactions 
being positive only in the case of student | 
feedback. 

The question of why teachers reacted to 
feedback from supervisors as they did is 
immediately raised. It can only be sur- 
mised that teachers are defensive toward 
(or even hostile to) administrators who, in
the absence of much basis for judgment,
attempt to tell them how to teach. Of in-
terest, though, is the fact that within the
educational milieu, the only source of feed-
back to teachers, typically, is their su-
pervisors. The data collected here indicate
that such feedback is doing more harm
than good, with the “best” source of feed-
back, students, overlooked.

The third hypothesis of the present 
study predicted an inverse relation be- 
tween years of experience and receptivity 
to feedback. While the obtained relation- 
ship was not sufficiently strong to prove 
significant, the most experienced teacher 
group tended to show the least receptivity 
to feedback from their students, as the hy- 
pothesis predicted. However, the least ex- 
perienced teacher group tended to show 
the least receptivity (i.e., the least rela- 
tively positive shift) to feedback from 
their supervisor, the reverse of the hy-
pothesis.

Finally, a last question must be raised.
Why do all the change scores tend to be
negative with positive change being meas-
ured in terms of the “smallness” of the
negative score? The use of a group of
teachers whose students made only the
postinterval ratings indicated that a
test-retest phenomenon was not responsi-
ble for this shift from pre- to postrating.
It appeared that students are more criti-
cal of their teachers at the end of the
term than at the middle. At the time when
the teacher is about to evaluate and grade
the student, the student perhaps replies
in kind. Thus, a positive change appeared as
a lessening in the “naturally” occurring 
negative shift. Researchers interested in 
using student judgments are cautioned to 
use the same starting and ending times for 
all groups to avoid the confusion of this 
end-of-term effect. September to January 
will not lead to the same effect as February
to June.


EFFECT OF SOCIAL PRESSURE ON
CONCEPT IDENTIFICATION¹


VERNON L. ALLEN AND BARRY W. BRAGG
University of Wisconsin 


This study investigated the effect of veridical and nonveridical group 
feedback on concept identification, and the transfer of the effect of 
such social pressure from 1 problem to a 2nd one. Social pressure 
consisted of a group of Ss giving either unanimously correct or in- 
correct responses over a series of trials. To study transfer of the 
social pressure effect, in 1 condition the group gave veridical feedback 
on the 1st problem and nonveridical feedback on the 2nd; the op-
posite order of feedback was given in another condition. Results
showed that veridical group feedback facilitated concept acquisition
and that nonveridical feedback depressed acquisition. Moreover,
transfer of the social pressure effect occurred between the 2 problems, 
resulting in poorer performance on the 2nd problem. 


Research during the past decade has 
shown conclusively that social pressure 
from a group influences individual be- 
havior on a variety of simple judgmental 
tasks, for example, perceptual discrimina- 
tions (Allen, 1966). Yet, little research 
has been directed toward investigating the 
possible role of social pressure in more 
complex cognitive processes such as 
learning and remembering. The paucity of 
research on the effect of social pressure on 
complex behavior is perhaps due to the 
tendency of social psychologists to con- 
sider the nature of the task unimportant 
or irrelevant in comparison to the basic 
psychological processes under investiga- 
tion. Thus, a task is often employed solely 
because it is simple and available; many 
such tasks no doubt tap only simple psy-
chological processes. Because of this em-
phasis on processes rather than tasks, the
study of social pressure has largely ne- 
glected the investigation of complex cog- 
nitive behavior. 

Our knowledge concerning the effects of 
social pressure on behavior indicates that 
the complex cognitive processes of learn-
ing and remembering might be particularly
vulnerable to social influence at certain
stages of learning. For example, the litera-
ture on social pressure shows that effects
of the group are more pronounced when the
task is ambiguous (Luchins, 1945; Walker
& Heyns, 1962) or when the person has lit-
tle confidence in his ability to make a cor-
rect response (Hochbaum, 1954; Wiener,
1958). During the initial phases of the
learning process the task is quite am-
biguous to S, and his confidence in his
ability to respond correctly is low. At
this stage, it is very likely that social pres-
sure would exert a strong influence on
learning; the effect of such social pressure
could, of course, aid or hinder the speed
of learning, depending on the objective
correctness or incorrectness of the group’s
response.


¹The research reported herein was performed
pursuant to a contract with the United States
Office of Education, Department of Health, Edu-
cation, and Welfare under the provisions of the
Cooperative Research Program.

Little research has been conducted on the 
effect of social pressure on learning and re- 
membering. Allen and Bragg (1967) 
showed that social pressure influences 
memory on a paired-associate learning
task. One study of acquisition (Rhine, 
1960) employed a very simple learning 
situation in which Ss were asked to predict 
whether a “little known” group of people 
possessed each of a series of desirable and 
undesirable traits. Results showed that 
peer-group responses aided acquisition on 
this simple task. In view of the meager sys-
tematic data available, the first purpose of
the present study was to explore the effect 
of social pressure during the acquisition 
phase of learning. Social pressure is pre-
sented in the present study in the form of
unanimous (correct or incorrect) feedback
from a group of S’s peers. In order to avoid
the limitations of the simple rote learning
situation, the concept identification task
was chosen for use in this study. 

A second purpose of the present study 
was to investigate transfer of the effect of 
social pressure from one problem to an-
other. Insufficient attention has been de-
voted to the potential sequential effects of
group pressure. A few studies have ad-
dressed themselves to the problem in a very
limited way by examining between-trial 
effects of social pressure on a single task. 
One study found a carry-over of social 
pressure between trials on perceptual 
judgments of numerosity of a pattern of 
dots (Fisher, Rubenstein, & Freeman, 
1956). In this study a confederate con- 
sistently gave estimates higher than S’s 
estimates. Not only was S influenced by 
the confederate’s response to the same 
stimulus, but S’s initial response on the 
next stimulus display—given prior to the 
confederate’s estimate—was also affected. 
That is, between-trial influence as well as 
within-trial influence was demonstrated. In 
a subsequent study, Peterson, Saltzstein, 
and Ebbe (1967), again using numerical 
estimates of dots, found between-trial in-
fluence when the stooge changed his re-
sponse each trial in order to maintain a
constant discrepancy from S’s preceding
estimate. But when the stooge maintained
a fixed absolute estimate, no between-trial
influence was observed. 

Relevant to the question of sequential 
effects of group pressure is Hollander’s 
(1960) research, indicating that tolerance 
of an individual’s deviation from the group 
is a function of his earlier behavior in re-
lation to the group. Greater acceptance of
an individual’s attempt to change the group
norm was shown when the individual’s
conformity to the group occurred in the
earlier stages of interaction, rather than at
later stages.

When the direction of the group’s re-
sponse changes over time, sequential effects
become crucial; the possibility of transfer
effects from one task to another then arises.
A task having an objectively correct an-
swer, such as the concept attainment task,
would appear to possess distinct advan-
tages for studying transfer effects of group
pressure. Use of such a task allows us to 
shed some light on the question of appro- 
priateness or efficiency of conformity and 
nonconformity. Much controversy exists 
concerning whether conformity to the 
group should be considered desirable or un- 
desirable behavior. Under certain circum- 
stances, conformity to a group is un- 
doubtedly a very adaptive and appropriate 
response. To agree with a group that gives 
veridical or objectively correct responses in 
a concept identification task, and to de- 
pend upon the group when one is uncer- 
tain, would facilitate learning. By contrast, 
if the group’s responses were nonveridical 
or incorrect, to agree because of social pres- 
sure is clearly inefficient since it would in- 
terfere with learning. The study of transfer 
effects of group pressure has been ne- 
glected, but the phenomenon attains con- 
siderable importance when dealing with ob- 
jective tasks. 

Social reality is complex; the behavior 
of a group does not always remain con- 
sistent over time. Agreement with a group 
is therefore advantageous to the individ- 
ual in some circumstances and disadvan- 
tageous in others. Consider the responses of 
a group on a concept identification task. 
As pointed out earlier, agreement with the 
group would facilitate identification of the 
concept if the group supplied correct re- 
sponses. Suppose that the group’s responses 
were initially correct, and that S came to 
rely on the group. If the same group later 
began giving incorrect responses, S’s con-
tinued reliance on the group would hinder
learning by delaying prompt adaptation to 
the new situation. Negative transfer effects 
of two types are therefore possible: (a) 
initial conformity to a correct group, fol- 
lowed by later conformity to the same 
group now giving incorrect responses, (b) 
initial nonconformity to an incorrect group, 
followed by nonconformity to the group 
now giving correct responses. 

Ideally, an individual’s behavior would 
consist of a high degree of selectivity in re- 
lation to the group. Because the group’s 
response is subject to change, selective de-
pendence, rather than rigid conformity or
nonconformity, is most advantageous to the
individual. Therefore, the most efficient re- 
lation of the individual to the group is con- 
formity when the group is correct and 
nonconformity when the group is incorrect. 

In summary, the purposes of this study 
are twofold: First, to investigate the ef- 
fect of social pressure on concept identifi- 
cation; and second, to study the transfer of 
the social pressure effect from one task to 
another. 


METHOD


Subjects 


The Ss for the study were 73 female freshman 
and sophomore students who volunteered to par- 
ticipate in the experiment without compensation 
of any kind. The Ss were randomly assigned to the
five experimental conditions. 


Apparatus 


The apparatus, a Crutchfield (1955) electrical 
signaling device used in conformity research, con- 
sists of five booths containing nine response 
switches and a matrix of 45 signal lights showing 
the answers given by the five Ss. Modification of 
the apparatus by relabeling switches permitted its 
use in the concept attainment task. The S is led to 
believe that she responds last in a group of five, 
and that other persons’ responses are shown in 
each booth. In actuality, all five Ss answer in the 
last position, and the signal lights shown in each 
S’s booth are controlled by E from another room.
In this way, all Ss are exposed to the same pattern 
of group pressure, and stooges are unnecessary. 
Five Ss were always tested together. 


Material 


The learning task consisted of slides containing 
various geometric designs. The slides were pro-
jected on a screen 10 feet in front of Ss. Five di-
mensions, varying on two attributes, were used in
the concept attainment task. The dimensions
were (a) size (large or small), (b) shape (square
or circle), (c) color (red or green), (d) number (one
or two), and (e) texture (plain or textured).


Instructions 


The Ss were told that their task was to solve a 
concept identification problem. A slide containing 
all five dimensions was first shown and the dimen- 
sions were described. The Ss were told that the 
concept to be identified would consist of one or a 
combination of the dimensions present in each
slide. The Ss’ task was to determine whether or
not a slide contained the concept, and to identify
the relevant dimensions.

The S was further told in the instructions:
If you think that the slide does contain the
concept, you should turn on switch number one 
marked “contains concept.” If you think that the 
slide does not contain the concept, turn on switch 
number two marked “does not contain concept.”
If you have decided the slide contains the con-
cept, then I want you to turn on one or more of
the five switches which indicates the relevant or 
correct dimension. For example, if on this slide 
you believed that the correct concept was “small 
green circle,” you would first turn on switch num-
ber one, then the switches corresponding to size,
color, and shape (switches 5, 6, and 7). If you 
thought that this slide did not contain the con- 
cept, and therefore turned on switch number 
two, I want you to turn on the switch or switches 
for the dimension or dimensions that are incor- 
rect. For example, if you thought that the con- 
cept was “small red circle” (instead of “small 
green circle”), then the incorrect dimension 
would be color and you would turn on the switch 
corresponding to color (switch number 5). After 
you all have answered, I will tell you if the slide 
contained or did not contain the concept. Then
we will go on to the next slide. 


Five practice trials were given, followed by 25 
test trials in each of the two series of trials. In 
the group conditions, Ss were assigned a position 
for responding and always answered in order. 
When the experiment began all Ss, unknowingly, 
were assigned to the last position, Number 5. This 
allowed E to control the simulated responses ob- 
served by S prior to her answering. The first prac- 
tice slide was always an example of the concept 
(“red-textured” for the first series, and “small” for 
the second series).

In summary, Ss responded by pressing one 
switch to indicate that a slide contained the con-
cept, or a second switch to indicate that the slide
did not contain the concept. One or more of five 
other switches in S’s booth, labeled by dimension 
(color, shape, etc.), were used by S to indicate 
correct dimensions if the slide contained the con- 
cept or to indicate incorrect dimensions if the slide 
did not contain the concept. After each trial, E 
reported whether or not the slide contained the 
concept, but he did not give information concern- 
ing the correctness of the dimensions comprising 
the concept. 
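
The stimulus structure and the "contains concept" rule described above can be sketched in code. This is only an illustration under stated assumptions, not the authors' procedure: the report does not say that every combination of attributes was shown, and the names DIMENSIONS, all_slides, and contains_concept, as well as the dictionary encoding of a slide, are hypothetical. Only the five dimensions and the two concepts ("red-textured" and "small") come from the text.

```python
from itertools import product

# Five dimensions with two attribute values each, as listed under Material.
DIMENSIONS = {
    "size": ("large", "small"),
    "shape": ("square", "circle"),
    "color": ("red", "green"),
    "number": ("one", "two"),
    "texture": ("plain", "textured"),
}


def all_slides():
    """Return every combination of attribute values, one dict per slide.

    Assumption: the full factorial set is used only for illustration.
    """
    names = list(DIMENSIONS)
    return [dict(zip(names, values)) for values in product(*DIMENSIONS.values())]


def contains_concept(slide, concept):
    """A slide contains the concept when it matches every relevant dimension."""
    return all(slide[dim] == value for dim, value in concept.items())


# Concepts named in the text for the first and second series of trials.
FIRST_CONCEPT = {"color": "red", "texture": "textured"}
SECOND_CONCEPT = {"size": "small"}

slides = all_slides()
first_positive = [s for s in slides if contains_concept(s, FIRST_CONCEPT)]
second_positive = [s for s in slides if contains_concept(s, SECOND_CONCEPT)]
print(len(slides), len(first_positive), len(second_positive))
```

Under this encoding a two-dimension concept such as "red-textured" is a conjunction, so fewer slides satisfy it than satisfy a one-dimension concept such as "small".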


Design 

The five conditions used in the study are de- 
scribed below. In each condition S received two 
concept identification problems, each problem con- 
sisting of a series of 25 trials. 

Control. In the control condition, Ss learned
the two concept attainment tasks without seeing 
the other five Ss’ answers to the 25 slides in each 
series. Twenty-one Ss were used in this condition. 

Veridical. In this condition, the feedback Ss 
received was mostly correct for each of the two
concept attainment tasks. On the first 12 slides
of the first series of 25 trials, there was some dis-
agreement shown among the simulated Ss in order
to increase credibility of the situation. But after 
the 12th slide the simulated Ss appeared unani- 
mously to choose the correct concept, and adhered 
to the concept for the remainder of the series. The 
same sequence of trials was used for supplying 
feedback in Conditions 4 and 5 below. Fifteen Ss 
were used in this condition. 

Nonveridical. The responses of others that S 
observed in this condition were incorrect on both 
problems. On the first 12 slides there was disagree- 
ment shown by the simulated Ss in their incorrect 
responses. But subsequent to Slide 12 all Ss agreed 
on the concept, giving identical wrong answers for 
each slide. The same sequence of incorrect trials 
was used for incorrect feedback in Conditions 4 
and 5 below. Thirteen Ss were used in this condi- 
tion. 

Veridical-nonveridical. In this condition Ss re- 
ceived correct feedback from the group on the 
series of 25 trials comprising the first problem, and 
incorrect group feedback on the second 25 trials
for the second problem. Eleven Ss were used in
this condition.

Nonveridical-veridical. In this condition Ss re-
ceived incorrect group feedback on the first prob-
lem, but correct group feedback on the second
problem. Thirteen Ss were used in this condition. 
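
Because each condition fixes the type of feedback given on each of the two problems, the number of Ss exposed to veridical or nonveridical feedback on a given problem follows from the condition sizes reported above. The sketch below simply does that bookkeeping; the names CONDITIONS, FEEDBACK, and pooled_n are hypothetical, and the label "none" for the control condition is an assumption made for the example.

```python
# Number of Ss per condition, as reported under Design.
CONDITIONS = {
    "control": 21,
    "veridical": 15,
    "nonveridical": 13,
    "veridical-nonveridical": 11,
    "nonveridical-veridical": 13,
}

# Feedback type received on (first problem, second problem) in each condition.
FEEDBACK = {
    "control": ("none", "none"),
    "veridical": ("veridical", "veridical"),
    "nonveridical": ("nonveridical", "nonveridical"),
    "veridical-nonveridical": ("veridical", "nonveridical"),
    "nonveridical-veridical": ("nonveridical", "veridical"),
}


def pooled_n(problem_index):
    """Total Ss per feedback type on one problem (0 = first, 1 = second)."""
    totals = {}
    for condition, n in CONDITIONS.items():
        kind = FEEDBACK[condition][problem_index]
        totals[kind] = totals.get(kind, 0) + n
    return totals


first = pooled_n(0)
second = pooled_n(1)
print(sum(CONDITIONS.values()), first, second)
```

The condition sizes sum to the 73 Ss reported under Subjects, and the pooled sizes differ between the two problems because the two mixed-order conditions switch sides.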


RESULTS


The most straightforward overall analy- 
sis of the data consists of calculating the 
percentage of Ss in each feedback condi- 
tion who correctly identified the concept 
used in the two problems by the end of 
each series of 25 trials. For this analysis 
data will be combined for the two veridical 
feedback conditions, and for the two non-
veridical conditions.

Table 1 presents results for the first con-
cept identification problem. Data in Table
1 show that on the first task 88% of the
Ss in the veridical feedback condition cor-
rectly identified the concept, as compared
with 23% in the nonveridical feedback
condition. The difference between the
veridical and nonveridical feedback condi-
tions was statistically significant at less
than the .01 level (χ² = 21.98, df = 1). In
the control condition, where S could not
observe other persons’ responses, 43% of
the Ss identified the concept. In the veridi-
cal feedback condition Ss’ performance
was significantly better than in the control
condition (χ² = 11.11, df = 1, p < .01);
but the decrease in performance of Ss in the
nonveridical feedback condition, relative to
the control, was not statistically signifi-
cant (χ² = 2.09, p < .20).


TABLE 1
CONCEPT ATTAINMENT ON THE FIRST PROBLEM

Condition               N    Correct   Incorrect
Control                 21     43         57
Veridical feedback      26     88         12
Nonveridical feedback   26     23         77

Note. Entries are percentage of Ss in each
condition who correctly identified the concept,
and the percentage who failed to do so.
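
The comparisons reported for the first problem are 2 × 2 chi-square tests on the number of Ss who did or did not identify the concept. As a rough check, the statistic can be recomputed from the reported group sizes and percentages. The sketch below rests on two assumptions that the report does not confirm: counts are back-calculated from rounded percentages, and the uncorrected Pearson formula (no continuity correction) is used. The function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def counts(n, percent_correct):
    """Back-calculate (correct, incorrect) counts from N and a rounded percentage."""
    correct = round(n * percent_correct / 100)
    return correct, n - correct


# Group sizes and percentages correct on the first problem, from the text.
control = counts(21, 43)
veridical = counts(26, 88)
nonveridical = counts(26, 23)

veridical_vs_nonveridical = chi_square_2x2(*veridical, *nonveridical)
veridical_vs_control = chi_square_2x2(*veridical, *control)
control_vs_nonveridical = chi_square_2x2(*control, *nonveridical)
print(round(veridical_vs_nonveridical, 2),
      round(veridical_vs_control, 2),
      round(control_vs_nonveridical, 2))
```

With these reconstructed counts the two comparisons against the control come out close to the reported 11.11 and 2.09, while the veridical versus nonveridical comparison comes out near 22.5 rather than the reported 21.98. That gap may reflect rounding in the percentages or a detail of the original computation that the report does not state.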

Results for the second concept identifi- 
cation problem were congruent with results 
for the first problem. It can be seen in Ta- 
ble 2 that veridical feedback from the 
group improved performance while non- 
veridical feedback depressed performance. 
In the veridical feedback condition, 79% 
of the Ss correctly identified the concept, 
as compared with only 12% in the non- 
veridical condition. Results for the control 
condition fell between the two experimental 
conditions (43%). The difference between
the veridical and nonveridical feedback 
conditions was statistically significant at 
less than the .01 level (χ² = 22.59, df =
1). In addition, scores in the veridical
feedback condition were significantly bet-
ter than in the control condition (χ² =
6.58, df = 1, p < .02), and scores in the
nonveridical feedback condition were sig-
nificantly poorer than in the control con-
dition (χ² = 4.56, df = 1, p < .05).

In summary, the highly significant differ- 
ences observed as a function of type of 
group feedback indicate that social pres- 
sure affected concept attainment, with 
veridical group responses facilitating per-
formance and nonveridical responses in-
terfering with performance.


TABLE 2
CONCEPT ATTAINMENT ON THE SECOND PROBLEM

Condition               N    Correct   Incorrect
Control                 21     43         57
Veridical feedback      28     79         21
Nonveridical feedback   24     12         88

Note. Entries are percentage of Ss in each
condition who correctly identified the concept,
and the percentage who failed to do so.

A second problem of interest in this 
study was the transfer of the group’s effect 
on concept attainment from the first task 
to the second one. Recall that in the 
veridical-nonveridical condition, the group 
gave correct responses on the first task, but 
incorrect answers on the second task. The 
opposite inconsistent order of group feed- 
back was followed in the nonveridical- 
veridical condition. The remaining two 
experimental conditions, in which the direc- 
tion of the group’s responses remained 
consistent across the two problems, pro- 
vided a base line against which transfer of 
the group’s inconsistent feedback across 
problems could be assessed. 

Results showed that transfer effects were 
clearly evident. Data were first examined 
for the two groups that received veridical 
feedback on the second task. Feedback in 
one of these conditions (veridical) was also
correct on the first task, but in the other 
condition (nonveridical-veridical) feed- 
back was incorrect on the first task. One 
would predict that transfer of the effects of 
incorrect group feedback given on the first 
task would detrimentally affect concept 
learning on the second task; in other words, 
negative transfer should occur. Results 
showed that the mean trial on which the 
concept was correctly identified when both 
tasks received veridical group feedback 
was 15.2, as compared with 17.7 when the 
first task had received nonveridical group 
feedback. The difference between the two 
conditions was significant at beyond the 
.05 level of confidence by the one-tailed t
test (t = 1.98).

To measure the transfer effect of veridi- 
cal group feedback when the second task 
received incorrect feedback, it was neces- 
sary to analyze the data somewhat differ- 
ently. Using the mean trial on which the 
concept was attained was not feasible be- 
cause so few Ss correctly identified the con- 
cept when incorrect feedback was given on 
the second task. Therefore, the mean num- 
ber of times Ss agreed with the incorrect 
responses of the group on the second task 
was used as an index of the transfer ef-
fect. Performance on the second task re-
ceiving nonveridical group feedback was
then analyzed as a function of whether the 
group’s feedback had been veridical or non- 
veridical on the first task. It was predicted 
that agreement with the group’s incorrect 
responses on the second task would be 
higher when feedback from the group on 
the first task had been correct than when 
group feedback on the first task had been 
incorrect. Results supported the prediction: 
The mean number of trials on which Ss 
agreed with the incorrect responses of the 
group on the second task was 11.5 for the 
veridical-nonveridical condition, as com-
pared with a mean of 7.5 for the nonveridi-
cal condition. The difference between the
two conditions was significant at the .05
level by a one-tailed t test (t = 1.82).


Discussion 


Results of the present study have shown 
that social pressure, in the form of un- 
animous responses of a group of peers, sig- 
nificantly affects behavior on a concept 
identification problem. The strength of the 
effect of social pressure on concept acquisi-
tion appears to be asymmetrical. Interest-
ingly and, perhaps, encouragingly, the
amount of the facilitating effect of social 
pressure in the form of veridical group 
feedback was approximately twice as great 
as the amount of the detrimental effect due 
to nonveridical group feedback. The 
greater effect of correct feedback than of 
incorrect feedback is in accord with a 
study by Jones, Wells, and Torrey (1958), 
in which the E provided objective feed-
back to the group.

It should be emphasized that the amount 
of facilitation of concept acquisition at- 
tributable to veridical group feedback was 
not insubstantial. In the veridical feedback 
condition 89% of the Ss accurately identi- 
fied the concept as compared with 43% in 
the control condition, an advantage of 46% 
attributable to the group’s feedback. 

Whether the effect of social pressure 
was due to mere public agreement with the 
group or represented the individual’s true 
belief is difficult to determine with cer-
tainty. The problem-solving situation is
one that would primarily tap informa-
tional rather than normative social in-
fluence (Deutsch & Gerard, 1955). That
is, agreement with the group was probably
due to S’s using the responses of other per-
sons as reliable sources of information
about a solution to the problem, rather than
to an attempt on S’s part to gain approval
or avoid disapproval from the other group
members. Instructions concerning the ex-
periment and the nature of the task both
served to orient Ss toward utilizing other
members of the group as informational
rather than normative sources of influence.
So it is very plausible to interpret the in-
fluence of the group as being due primarily
to informational influence, and not to mere
public compliance to the group.

It is interesting that although the task
was equally unfamiliar to all group mem-
bers, S was willing often to agree with the
answers given by the group. No doubt such
agreement served to reduce S’s motivation
to search for the solution to the concept
identification problem. Unanimity among a
group of persons often means that their re-
sponses are correct; initial acceptance of
such an assumption probably led S to
place undue dependence on the group.
Agreement with an apparently self-confi-
dent group could have caused a decrease in
cognitive arousal on the part of S. As a con-
sequence, S probably relaxed somewhat
and exerted less cognitive effort in finding
the solution to the problem. Such relaxa-
tion is perhaps also partially due to S’s
acquiring a “set” of agreeing with the
group which is difficult to break. Like Ss in
Luchins’ (1942) water-jar problem, once
the set is established, a new and critical
analysis of the problem is accomplished
slowly and with difficulty. The cognitive
set to agree with the group can sometimes
clearly aid S’s problem-solving attempts or
equally often serve as a barrier, depend-
ing on the degree of veridicality of the
group’s responses.

The occurrence of transfer of the group’s ef-
fect on S’s concept acquisition from the first
to the second task is a very intriguing
finding. When the group had previously
given correct responses on the first task,
Ss were likely to continue agreeing with
the group on the second task, although the
group now gave incorrect responses. Con-
formity to the group at this time was in-
appropriate because the group’s behavior
was inconsistent with its previous veridical 
responses. Similarly, Ss were unnecessarily 
inefficient when the group had given in- 
correct responses on the first problem, 
but changed to supplying correct answers 
on the second; in this case, Ss conformed 
less than was warranted by the group’s 
veridical answers given at that time. 

The answer to the value question of 
whether conformity is desirable or unde- 
sirable obviously is shown in this study to 
depend on the specific characteristics of the 
situation. Appropriate and efficient be- 
havior would consist of an individual’s 
conforming to a group sometimes on some 
issues, and disagreeing at other times on 
other issues. The difficulty of increasing 
selective response to group pressure is, how- 
ever, very real. The tendency is strong to 
respond consistently to a group (or person 
or situation), even though rational analy- 
sis would dictate a change in response. 
The transfer phenomenon observed in the 
present experiment appears to be a special 
case of a more general psychological phe- 
nomenon found in other contexts. For 
example, the halo effect observed in pres- 
tige suggestion is a case of behaving con- 
sistently toward an individual across sit- 
uations, though a change in behavior would 
be the more reasonable response (Aronson 
& Golden, 1962). A source having high 
prestige on one topic tends to produce un- 
warranted agreement on another topic on 
which he has little competence; similarly, 
a source having low prestige on one topic 
often produces lower agreement than war- 
ranted on a second topic on which he has 
limited competence.

Transfer effects of the type found in the 
present experiment are probably not un- 
common in everyday social behavior, but 
the problem remains to be systematically 
explored in future research.


SUCCESSIVE VERSUS CONCURRENT PRESENTATION OF
MULTIPLE GRAPHEME-PHONEME
CORRESPONDENCES¹


JOANNA P. WILLIAMS 
University of Pennsylvania


Many new instructional programs recommend that the beginning 
reader be given material built on a simplified, regularized pattern of 
1:1 grapheme-phoneme correspondences. To test this assumption, 2
methods of training multiple correspondences (1 grapheme mapping
to 2 phonemes) were compared in a paired-associate paradigm. In
successive training, only 1 of the 2 phonemes associated with a par-
ticular grapheme was presented at a time, while in concurrent train-
ing, both phonemes associated with each grapheme were introduced
and practiced concurrently. Results suggest that concurrent training
is superior, both in terms of the kind of “set” developed and of the
performance level on the correspondences given in training. These 
findings run counter to the typical recommendations. 


One of the fundamental concepts that 
must be developed by the beginning reader 
of English—or of any language based on 
the alphabetic principle—is that there is a 
correspondence between spoken language 
and orthography. There is some question 
as to what the critical unit of correspond- 
ence actually is, for each individual letter 
does not necessarily correspond to a pho- 
neme (Gibson, Gibson, Pick, & Ossen, 
1962; Hall, 1961). While recent analyses 
suggest that English orthography is a more 
regular (and more complex) system than
had heretofore been recognized (Weir & 
Venezky, 1965), nevertheless, even when 
combinations of graphemes (spelling pat- 
terns) are considered as the basic unit, 
there is not a one-to-one correspondence.
The irregularities add considerably to the
difficulty of learning to read. 

: Because of these difficulties, it has been 
ha that only one-to-one correspond- 
ie be presented to the beginning reader. 
is argued that starting with a simplified 
pattern and later introducing irregularities 
ee 
Coons rommarch was supported in part by the 
ace aie Research Program, United States Of- 
Pcie lucation, and in part by a faculty research 
tant from the University of Pennsylvania. A pre- 
eitedice note of some of these data was Pre- 
petranpeee  S 
Diindebta “= search Association. The author 
Pollseu 0 Ellen I. Levy for her assistance In 
‘ecting and analyzing the data. 


will make the child understand more read- 
ily the notion that English is basically an 
alphabetic language. Moreover, such a 
training procedure would presumably also 
lead to more efficient mastery, in the long 
run, of all the correspondences that are to
be found in the language. A great many of
the new approaches to reading instruction 
subscribe to this idea, for example, the 
linguistic methods of Bloomfield (1942) 
and of Fries (1963) and the “initial teach- 
ing alphabet” approach (Downing, 1963). 
On the other hand, there is also some 
justification for taking the position that at 
least some multiple correspondences should 
be presented right from the start of in- 
struction. The child must indeed learn 
that the orthography relates to the spoken 
language, but he must also learn that there 
are alternative spellings for most phonemes 
and alternative phonemes for many spell- 
ings. Indeed, Levin and Watson (1963) 
have suggested that a child who is pre- 
sented with multiple correspondences in 
the initial stages of instruction will be 
more likely to develop a useful problem- 
solving approach to reading (i.e., a “set 
for diversity”). That is, if he is aware that 
there is more than one phoneme associated 
with a particular grapheme, he will be 
likely to try out a variety of pronuncia- 
tions when he is faced with an unfamiliar 
word. In addition to the possibility that an
effective “set” would be shaped through
this kind of training, it is also possible
that the correspondences themselves might
be learned more effectively in this manner.

Fig. 1. The graphemes.

In the only data relevant to this ques- 
tion, Levin and Watson (1963) have shown 
that third-grade children learned a transfer 
list of words more readily after training on 
variable correspondences, as compared to 
training where the correspondences were 
constant. 

When instructional methods are studied 
in terms of simple laboratory paradigms, 
it seems especially important to consider 
not only the effectiveness of training but 
also its efficiency. A low error-rate during 
training or superior performance on a test 
may mean very little in terms of applica- 
tion to actual instruction, if the extra gain 
has been due to a greatly extended training 
time. In Levin and Watson’s experiments, 
the original training was continued until 
Ss reached a performance criterion. The 
variable-training group took considerably 
longer to reach criterion, and perhaps the 
differences in amount of training could 
account for the differences seen in the 
transfer task.

The present experiment was designed to 
compare two methods of teaching mul- 
tiple correspondences when both methods 
were given an equivalent amount of training:
(a) successive: only one of the two
phonemes associated with a particular
grapheme was presented at a time, and
(b) concurrent: both phonemes associated
with each grapheme were introduced and 
practiced concurrently. Single (one-to-one) 
correspondences were also incorporated 
into the learning task in order to simulate 
more closely the varied nature of the cor- 
respondences found in the actual reading 
situation. 


Experiment I


Method 


Materials. Each of six graphemes, similar to 
those used by Gibson et al. (1962), was printed on 
a white 5 × 8 inch card. These graphemes are
presented in Figure 1. Ten phonemes were selected,
on the basis of pretesting, which were easily
produced by S and readily discriminable to E. These
phonemes were: /m/, /a/, /y/, /iy/, /s/, /b/,
/uw/, /t/, /%/, /é/. Two of the six graphemes, each
of which had two phonemes paired with it, were
successive items. That is, one phoneme only was
presented during the first half of the training
session, and the second phoneme to be associated
with that grapheme was presented during the
second half of training. Two other graphemes, each
of which also had two phonemes paired with it,
were concurrent items: both phonemes associated
with each grapheme were presented on each trial
throughout the entire training period. Each of the
last two graphemes had only one phoneme paired
with it; these single, one-to-one correspondences
were also presented throughout training. The
particular graphemes and phonemes used for each
type of presentation (successive, concurrent, or
single) were balanced over Ss, as was the pairing
of the graphemes and phonemes.

Subjects. The Ss were 36 low-track fifth and
sixth graders. They were asked to volunteer to try 
out new materials which were being developed to 
help children learn to read. 

Procedure. The Ss were shown a sample grapheme
(not one used in the experiment) and told
that they were to learn the sounds that went with
such forms. They were instructed that some forms
had more than one sound and that new sounds for
some of the forms might be introduced later.

A modified paired-associate method was used.
Each grapheme was presented individually in
random order on each trial. On the first trial, S
was simply shown each item in turn. As each
grapheme was exposed, E pronounced the phoneme
or phonemes associated with it, and S was asked
to repeat the sounds. Each grapheme-phoneme
correspondence was given equal training time.
That is, during the 10-second exposure of a
concurrent item, both phonemes were given once each,
and during the 10-second exposure of a successive
item, only the first of its two phonemes was given,
and it was given twice. The one phoneme attached
to each simple item was given once, and the total
exposure was half as long (5 seconds) for each
simple item as that of the other types of items.

After the initial trial there were three
anticipation trials. Each grapheme was presented for
approximately 10 seconds, and S was required to
give the sound or sounds associated with it. During 
this part of training, both phonemes were pre- 
sented for each concurrent item, and only the first 
of the two phonemes to be associated with each 
successive item was given. 

The second half of training consisted again of
one initial trial in which E pronounced the
phonemes as the graphemes were exposed, and then
three anticipation trials. There was no break
between the first and second half of training. In this
part of the training, the second phoneme
associated with each successive item was presented;
the other items, of course, remained the same. The
Ss were told after the presentation of each
grapheme the correct response for that item. Verbal
reinforcement, such as “very good,” was given for
items on which S responded correctly, and E drew
a star on the data sheet beside every correct
answer.
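
The claim that every correspondence received equal training time can be checked with a little arithmetic. The Python sketch below is illustrative only. It assumes eight trials in all (one initial trial plus three anticipation trials in each half), that single items kept their 5-second exposure on anticipation trials (the text gives that figure only for the initial trial), and that the 10-second exposure of a concurrent item was shared evenly by its two phonemes. The function and variable names are hypothetical.

```python
# Rough check that each grapheme-phoneme correspondence gets equal exposure.
TRIALS_PER_HALF = 4  # 1 initial trial + 3 anticipation trials


def seconds_per_correspondence(item_type):
    """Total exposure (in seconds) credited to one correspondence."""
    if item_type == "concurrent":
        # 10 s per trial shared by two phonemes, presented in both halves
        return (10 / 2) * (2 * TRIALS_PER_HALF)
    if item_type == "successive":
        # 10 s per trial for a single phoneme, presented in one half only
        return 10 * TRIALS_PER_HALF
    if item_type == "single":
        # 5 s per trial for a single phoneme, presented in both halves
        # (assumed to hold on anticipation trials as well)
        return 5 * (2 * TRIALS_PER_HALF)
    raise ValueError(f"unknown item type: {item_type}")


exposure = {
    item_type: seconds_per_correspondence(item_type)
    for item_type in ("single", "successive", "concurrent")
}
print(exposure)
```

Under these assumptions all three item types come out at the same total, which is the sense in which exposure time was equated.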


Results 


Three tests were given immediately 
after training. Half of the Ss received the 
tests in 1-2-3 order; the other half, in 1-3-2
order. In Test 1, Ss were simply shown 
each grapheme individually, and they were 
asked, “How many sounds did you learn 
for this form?” 

Table 1 presents the number of correct
responses as a function of item type and
test order. Analysis of variance indicated
a significant difference among item types
(F = 17.56, df = 2/68, p < .001). There
was no difference, of course, as a function
of order of presentation of the tests (F =
.78, df = 1/34), nor was there any
interaction (F = 2.38, df = 2/68). Specific
comparisons showed that Ss performed
equally well on the simple items as on the
concurrent items (F < 1, df = 1/68), and
that scores on the successive items were
significantly lower than on the other types
(successive versus simple: F = 28.76,
df = 1/68, p < .001; successive versus
concurrent: F = 11.68, df = 1/68, p < .005).

In Test 2, each grapheme was presented,
and E said, “Give me all the sounds you
learned for this form.” These data are also
presented in Table 1. Analysis of variance
was done on Tests 2 and 3 on the two
types of multiple correspondences only,
because the design of the experiment made
it inappropriate to include the simple
items in these analyses. Performance was,
as on Test 1, better on concurrent items
than on successive items (F = 21.95, df =
1/34, p < .001). There was no test-order
effect, nor was there any interaction (both
Fs less than 1; df = 1/34).

TABLE 1
Number of Correct Responses to Each Type
of Item in Experiment I

                  Type of item (training mode)
Test              Single    Successive    Concurrent

Test 1
  Sequence a      32        19            [illegible]
  Sequence b      29        10            31
Test 2
  Sequence a      25ᵃ       30            46
  Sequence b      21ᵃ       24            43
Test 3
  Sequence a      28ᵃ       27            [illegible]
  Sequence b      26ᵃ       37            28

Note.—Sequence a = Test 1, Test 2, Test 3;
Sequence b = Test 1, Test 3, Test 2.

ᵃ Number of opportunities equals half of the
number for successive and concurrent items.

In Test 3, all the graphemes were presented
together in an array. E pronounced
each phoneme, and S was asked
to point to the form that had been
associated with that phoneme. After each item,
the arrangement of the graphemes in front
of S was changed (randomly). On this test,
performance did not differ between the
concurrent and the successive items
(F < 1, df = 1/34).
As on the other tests, there was no effect
of test order (F < 1, df = 1/34), nor was
there an interaction between the two
variables (F = 4.08, df = 1/34).


Experiment II


As described above, the exposure time
for the different types of items was
equated in Experiment I. However, the
number of separate presentations of each
successive correspondence was only half
the number of presentations given to each
concurrent correspondence. In order to
determine the importance of number of
presentations, a second experiment was
run in which, on each trial, there were
two separate presentations for each
successive item. This was not a control;
rather, if the original experiment were
considered to be biased in favor of concurrent
items because of the greater number of
presentations, then this experiment was
biased in favor of the successive items.
While the number of presentations was
now constant, the time between any two
presentations of the same successive
correspondence was, of course, much shorter.
Thirty-six Ss were run, and the data are
presented in Table 2.

TABLE 2
Number of Correct Responses to Each Type
of Item in Experiment II

                  Type of item
Test              Simple    Successive    Concurrent

Test 1
  Sequence a      32        7             29
  Sequence b      30        10            29
Test 2
  Sequence a      27ᵃ       37            31
  Sequence b      24ᵃ       36            43
Test 3
  Sequence a      25ᵃ       37            35
  Sequence b      28ᵃ       39            34

Note.—Sequence a = Test 1, Test 2, Test 3;
Sequence b = Test 1, Test 3, Test 2.

ᵃ Number of opportunities equals half of the
number for successive and concurrent items.
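
To make the Test 1 counts in Table 2 easier to compare, they can be converted to proportions. The sketch below is a worked illustration, not part of the original analysis. It assumes 18 Ss per test sequence (half of the 36 Ss) and two graphemes of each type per S, so that each type offers 72 opportunities on Test 1 when the two sequences are pooled. The variable names are hypothetical.

```python
from fractions import Fraction

# Assumed design constants (inferred from the Method section, not stated as such)
SS_PER_SEQUENCE = 18   # half of the 36 Ss in each test order
ITEMS_PER_TYPE = 2     # two graphemes of each type per S
N_SEQUENCES = 2

opportunities = N_SEQUENCES * SS_PER_SEQUENCE * ITEMS_PER_TYPE

# Test 1 counts from Table 2, Sequence a + Sequence b
test1_correct = {
    "simple": 32 + 30,
    "successive": 7 + 10,
    "concurrent": 29 + 29,
}

test1_proportion = {
    item_type: Fraction(count, opportunities)
    for item_type, count in test1_correct.items()
}

for item_type, proportion in test1_proportion.items():
    print(f"{item_type}: {test1_correct[item_type]}/{opportunities} "
          f"= {float(proportion):.2f}")
```

On these assumptions, Ss correctly reported the number of sounds for most simple and concurrent items but for only about a quarter of the successive items.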

The overall level of performance was not
different from that of the initial
experiment (Test 1: t = .134; Test 2: t = .501;
Test 3: t = .952, df = 70). Again, on Test 1,
analysis of variance indicated a difference
among item types (F = 10.20, df = 2/68,
p < .001), no difference between test
orders (F = 2.68, df = 1/34), and no
interaction (F < 1, df = 2/68). Specific
comparisons indicated that performance on
simple items was significantly higher than that
on successive items (F = 30.93, df = 1/68,
p < .001), but did not differ from that of
concurrent items (F < 1, df = 1/68).
Performance on concurrent items was
significantly higher than that on successive items
(F = 24.27, df = 1/68, p < .001).

On Test 2, there was no difference
between successive and concurrent items
(F < 1, df = 1/34), contrary to the results
of Experiment I. There was no effect of
test order (F < 1, df = 1/34), nor was
there an interaction (F = 2.41, df = 1/34).
On Test 3, there were no differences between
the two types of items or between the test
orders, nor was there an interaction (all
Fs < 1, df = 1/34).


Experiment III


Further analysis of the original data
(Experiment I) showed that many more of
the successive correspondences that had
been presented during the second half of
training were given correctly than those
presented during the first half (Test 2:
t = 2.707, df = 35, p < .02; Test 3: t =
3.734, df = 35, p < .01). That is, there was
a recency effect. In order to equate the
strengths of the two successive
correspondences at the end of training, Experiment
III was run, again using 36 Ss. Here, the
six anticipation trials were divided
differently. Instead of three trials on the
first set of correspondences and then
another three on the second set, four trials
were given on the first set and two on the
second. Again, the overall level of
performance (data presented in Table 3) was not
different from that of the initial experiment
(Test 1: t = .559; Test 2: t = .972; Test 3:
t = 1.720, df = 70). The modification in
design was effective in equating the
strengths of the two successive
correspondences, for there was no difference
between the number of correct responses to
the correspondences presented in the first
half of training and the number to those
presented in the second half of training
(Test 2: t = 1.708, df = 35; Test 3:
t = 1.484, df = 35).

TABLE 3
Number of Correct Responses to Each Type
of Item in Experiment III

                  Type of item
Test              Simple         Successive    Concurrent

Test 1
  Sequence a      33             17            29
  Sequence b      31             8             34
Test 2
  Sequence a      28ᵃ            42            38
  Sequence b      21ᵃ            32            46
Test 3
  Sequence a      31ᵃ            44            [illegible]
  Sequence b      [illegible]    37            34

Note.—Sequence a = Test 1, Test 2, Test 3;
Sequence b = Test 1, Test 3, Test 2.

ᵃ Number of opportunities equals half of the
number for successive and concurrent items.

As in Experiment II, only Test 1 showed
concurrent items significantly superior to
successive items (overall F = 29.98, df =
2/68, p < .001; specific comparison F =
43.79, df = 1/68, p < .001) and similar to
simple items (F < 1), which were
significantly superior to successive items (F =
46.12, df = 2/68, p < .001). There were no
differences as a function of test order
(F = 1.35, df = 1/34), nor was there an
interaction (F = 2.97, df = 2/68).
On Test 2, neither main effect was
significant (type of item: F < 1, df = 1/34;
test order: F < 1, df = 1/34) and there
was no interaction (F = 3.72, df = 1/34).
On Test 3, item type was not a significant
effect (F < 1, df = 1/34), nor was test
order (F < 1, df = 1/34). There was no
interaction (F < 1, df = 1/34).


Discussion 


It was suggested above that a comparison 
of these two methods of training multiple 
correspondences should be made in terms 
of (a) the level of performance on the 
correspondences as a function of type of
training, and (b) the kind of set developed
by each type of training. With respect to
the latter question, a consideration of
Test 1 is relevant, in which S was asked
how many sounds went with each form.
Presumably, if a child does not know that
variation is possible, he will not try out
several phonemes when he is attacking a
new word, and thus he will be less likely
to succeed in reading that word. The data
indicate that Ss were better able to identify
a grapheme as corresponding to more than
one phoneme if the correspondences had
been trained concurrently. This suggests
that in attempting to read new words, Ss
would more readily identify such graphemes
as “multiple” and so try out more than
one phoneme, thereby making it more
likely that they would read the word suc- 
cessfully. 

Further support for the idea that con- 
current training fosters the development 
of a useful problem-solving set comes from 
the results of an analysis of the errors 
made on Test 2. Errors were divided into 
two types: (a) incorrect responses, and 
(b) omissions. In Experiment I, the mean 
proportion of omissions on successive items 
was .40, whereas on concurrent items this 
figure was .13. This difference was signifi- 
cant according to a sign test (z = 2.00, 
p < .05). The same pattern was seen 
in the other experiments: Experiment II:
successive items: .43; concurrent items:
.09; z = 2.00, p < .05; Experiment III:
successive items: .42; concurrent items:
.12; z = 2.33, p < .05.
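
The reported z values can be related to the stated significance level with a short calculation. The sketch below is an illustration under stated assumptions: it treats z as a standard normal deviate and uses a two-sided p value, and it shows one common form of the sign test's normal approximation (with a continuity correction and ties dropped). The text does not say which form was used, and the counts passed to `sign_test_z` are hypothetical, since the raw data are not given.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a standard normal deviate z."""
    return math.erfc(abs(z) / math.sqrt(2))


def sign_test_z(n_plus, n_minus):
    """Normal approximation to the sign test with continuity correction.

    n_plus and n_minus are the numbers of Ss whose difference fell in each
    direction; tied Ss are assumed to have been dropped beforehand.
    """
    n = n_plus + n_minus
    return (abs(n_plus - n_minus) - 1) / math.sqrt(n)


# z values as reported in the text
reported_z = {"Experiment I": 2.00, "Experiment II": 2.00, "Experiment III": 2.33}
p_values = {name: two_sided_p(z) for name, z in reported_z.items()}
for name, p in p_values.items():
    print(f"{name}: z = {reported_z[name]:.2f}, two-sided p = {p:.3f}")

# Hypothetical counts, for illustration only: 15 Ss in one direction, 5 in the other
example_z = sign_test_z(15, 5)
print(f"hypothetical split 15 vs. 5: z = {example_z:.2f}")
```

Each reported z corresponds to a two-sided p below .05, in line with the significance levels given above.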

These data suggest that when S knew 
that an item had two phonemes, as he did 
on concurrent items, he was more likely 
to attempt to give two phonemes, guessing 
when he was not sure. 

The second question is whether or not 
there is a difference between the two train- 
ing procedures in terms of a simple per- 
formance criterion. That is, given equal 
training time, how many correspondences 
of each type were learned? Significant dif- 
ferences did in fact appear in favor of 
the concurrent items on Test 2 (in which 
S was to give all the sounds he had 
learned for each form). However, this 
effect was small. It was also quite specific 
to the particular training conditions, for 
on Test 3 there were no differences between 
successive and concurrent training.
Moreover, the superiority of concurrent items on
Test 2 did not appear in Experiments II
and III. Thus, in terms of the performance
criterion, there was in some sense a
tendency for concurrent presentation to be more
effective, but the results were far from 
clear. 

It should be noted, however, that even 
when extra training and bias in favor of 
successive presentation was introduced 
(in Experiments II and III), scores on 
successive items equalled but never
significantly surpassed those on concurrent
items. Thus there seems to be no justification
in the present data for the use of
successive training methods.

In attempting to make generalizations 
from this type of laboratory analogue to 
actual instructional situations, one must 
keep in mind the differences between the 
classroom and the experimental situation. 
One of the more important in the present 
study, as in Bishop (1964) and Levin and 
Watson (1963), is that Ss were older chil- 
dren who had already learned to read. 
Would a child who is truly naive about 
reading and the basic notion of correspond- 
ence perform similarly? 

To check this, first graders were tested 
early in their first term of reading instruc- 
tion. In order to increase the generality of 
the finding, and also because in many in- 
stances the basic unit in instruction is the 
whole word, the items were homographs, 
that is, words which have two distinct 
pronunciations and meanings but only 
one spelling, such as “wind”. Nineteen Ss 
were run. A four-item list was used, in 
which two words were presented as con- 
current items and two as successive items. 
There were no single items. Six trials were 
run, structured as in Experiment I. Only
Tests 1 and 2 were administered. On Test 1,
the mean number correct for the successive
items was .74, and for the concurrent
items, 1.58. This difference was significant
at the .01 level (t = 3.62, df = 18).
Performance on successive items (mean
number correct = 2.53) was also significantly
below that on concurrent items (mean
number correct = 3.21) on Test 2 (t =
3.16, df = 18, p < .01). Thus as expected,
the first graders’ results were similar to
those of the main experiments.

Of course, much more research is necessary
in order to apply such findings as
these to actual instruction. For one thing, 
few programs teach letter-sound corre- 
spondences in isolation. Further work is in 
progress, focusing on spelling patterns pre- 
sented in the context of words (as, for 
example, presented in Fries’, 1963, reading 
materials). At present, however, in the 
absence of sufficient data on which to
base a final decision, it would seem rea- 
sonable to provide at least some variation— 
some kind of concurrent training—when 
presenting multiple grapheme-phoneme
correspondences.


MANIPULATING THE EFFECTIVENESS OF A SELF-
INSTRUCTIONAL PROGRAM


JAMES C. MOORE 
University of New Mexico


3 principles of programmed instruction defined as (a) gap, (b)
irrelevancies, and (c) mastery were systematically varied in 8 versions
of a self-instructional program on test-taking strategies. The criterion
of interest was the number of guessing responses an S made under 
each of 4 test conditions. When content identical or similar to ques- 
tions on the criterion tasks was removed from the instructional 
materials to create a gap, a significant decrement in performance oc- 
curred. When irrelevant material was added to the instructional con- 
tent and was required to be mastered, a decrement in performance 
resulted. However, material containing irrelevant instruction which 
was not required to be mastered did not result in performance 
decrements. Mastery also interacted with the gap effect. 


Although programmed instruction has 
attracted a good deal of attention, the re- 
search directed toward the development of 
programming principles has been slight. 
A number of publications (Glaser, 1964; 
Klaus, 1961; Mager, 1962; Stolurow, 1961; 
Walther & Crowder, 1965) have discussed 
and, in part, have demonstrated “prin- 
ciples of programming.” Few experimental 
studies have been reported that have at- 
tempted to evaluate the effect on per- 
formance of the absence or presence of 
programming principles singly or in
combination by manipulating them within 
the content of self-instructional material. 

The present study systematically 
varied three defined classes of variables in 
a self-instructional program. It was
hypothesized that varying program variables
in a controlled manner would provide
data relating the contribution of these
variables to program effectiveness.
Conceptually, the research rationale employed
was quite similar to that suggested by
McClelland (1965). He proposed that once
a number of factors had been identified as
important in producing a substantial effect
with all factors working together to
produce it, each factor should then be
studied alone to determine its single effect.
After a substantial effect has been
demonstrated with multiple factors working to
produce it, that part of the treatment
which deals with each of the factors would
be subtracted to discover if there is a
significant decline in effect. It should also
be possible to omit several factors in various
combinations to study interaction effects.
The present study applied this conceptual
framework to research on principles
identified as important in self-instruction. 


PRINCIPLES TO BE INVESTIGATED 


Silberman, Coulson, Melaragno, and 
Newmark (1964) studied exploratory re- 
search and individual tutoring techniques 
with the expressed objective of discovering 
empirical programming methods which 
would contribute to the theory of pro- 
gramming. Four self-instructional pro- 
grams were intensively studied and re- 
vised using tutoring techniques with 
individual students to create modified 
programs significantly superior to the 
original ones. Operations performed on the 
different programs which led to improved 
student performance were then compared 
to isolated operations common to all four 
programs and, therefore, likely to be com- 
mon to the design of any instructional 
material. Three principles were isolated: 
(a) gap, (b) irrelevancies, and (c) mastery. 
The principles refer, basically, to the addi- 
tion, elimination, and repetition of instruc- 
tional material and thus may be identified 
with specific structural features of the 
material. 

Although these principles may be in- 
tuitively obvious on an a priori basis, 
they bear the added strength of having 
been induced empirically and can be
experimentally manipulated to determine
how their presence or absence within
instructional material affects learning
performance.


Method


Self-Instructional Materials 


The basic program used in the study was the 
Test-Taking Strategy program (Moore, Schutz, & 
Baker, 1966). The program was designed to develop 
optimal test-taking strategy. The intrinsic pro- 
gramming technique was used as the instructional 
method (Crowder, 1959). Specifically, the program 
deals with how to respond to test questions under 
the following conditions: (a) time limit with no 
penalty for guessing, (b) time limit with penalty 
for guessing, (c) no time limit with penalty for 
guessing, and (d) no time limit with no penalty 
for guessing. The program explains the operation 
of correction formulas and describes their effect on 
test scores. Nine instructional units associated with 
the following headings are contained in the pro- 
gram: 

A. Introduction to scramble book and subject matter

B. Tests with no correction for guessing

C. Tests with correction for guessing

D. Efficient use of testing time

E. Rationale for scoring formulas

F. Time limits

G. Simulated directions and items under a variety of conditions

H. Review of rules

I. [illegible]

The readability of the program is approximately
the fifth-grade level as analyzed by using the
Thorndike and Lorge (1944) index. In total, the
text is 78 pages in length. By responding to all
options correctly with no remedial branching, 39
of the 78 pages would be read. If incorrect
responses are chosen to all options, 61 pages would
be read. Consequently, most students read between 
40-60 pages. 


Design and Hypotheses 


A 2 × 2 × 2 factorial design (Winer, 1962, pp.
228-258) was used to study each of the three
principles: (a) gap, (b) irrelevancies, and (c)
mastery; and to test the hypotheses generated.
Each of the principles was a main effect with two
levels. Five 2 × 2 × 2 analyses were conducted,
one for each of five criterion scores. Hypotheses
tested were specific to the design. That is, in each
analysis, three hypotheses were tested for the three
main effects. The three main effects generated four
interaction hypotheses. Hypotheses were tested
for significance at the .05 level.
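
The counts in this design (eight cells, three main effects, four interactions) follow directly from crossing three two-level factors. The short Python sketch below enumerates them. It is only an illustration of the design's structure; the level labels "present" and "absent" are hypothetical names, not the author's.

```python
from itertools import combinations, product

factors = ("gap", "irrelevancies", "mastery")
levels = ("present", "absent")  # hypothetical labels for the two levels

# Every combination of levels is one version of the program (one cell)
cells = list(product(levels, repeat=len(factors)))

# One hypothesis per main effect, plus one per two-way and three-way interaction
main_effects = [(factor,) for factor in factors]
interactions = [
    combo
    for size in (2, 3)
    for combo in combinations(factors, size)
]

print(f"{len(cells)} cells")
print("main effects:", [" ".join(effect) for effect in main_effects])
print("interactions:", [" x ".join(effect) for effect in interactions])
```

The four interaction hypotheses are thus the three two-way interactions and the single three-way interaction.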


Statement of Principles and Manipulation 
of Program Content 


Gap principle. This principle refers to explicit
inclusion in the program of instructional units
insuring that all criterion skills are covered. The gap
principle asserts that criterion performance is in- 
creased by the absence of gaps and decreased by 
the presence of gaps. Two types of frames* appear 
to fill gaps in a program. 

1. Frames logically related to the criterion task, 
which a content analysis indicates are intermediate 
steps to learning the criterion task. 

2. Terminal frames identical or similar to ques- 
tions on the criterion test, that is, requiring similar 
responses, or containing similar item content, or 
both. 

The principle was studied by holding frames
associated with Type 1 constant while removing
frames associated with Type 2. Thus, the
discriminative cues for planning a test-taking strategy
were provided for (Type 1 frames), but evidence
of having learned the skill by application (Type 2
frames) was not required. The remaining
instructional material, however, covered the subject
matter logically related to the criterion task.
Therefore, only the opportunity for application of the
concepts covered was removed.

Irrelevancies principle. Irrelevancies refer to
unnecessary or distracting frames present in a
program. The irrelevancies principle asserts that
the presence of irrelevant material will decrease
criterion performance. Two types of irrelevant
material were studied.

1. Frames not contributing to criterion
performance even though they might possess face
validity as necessary steps in reaching criterion
skills.

2. Frames not at all similar to the criterion task 
in the nature of the responses required, or in the 
item content, or both. 

Irrelevant frames are analogous to the two 
types of items that fill gaps, but provide instruc- 
tion for tasks that are not objectives of the pro- 
gram. While irrelevant frames might possess face 
validity, a task analysis would indicate that the 
frames are not necessary for the attainment of the 
desired instructional outcomes.

Two kinds of irrelevant information were
introduced to study the effect of the irrelevancies
principle. One type of irrelevant information was
instruction on subtasks which were judged to be
irrelevant to the terminal objectives of the
test-taking strategy program. The subtasks, however,
were related to the subject of testing. One subtask
concerned the differences between teacher-made
and standard tests while the other subtask
concerned the differences between objective,
completion, and essay tests. The other kind of irrelevant
material was the introduction of frames
pertaining to correction formulas. While it was judged
necessary to provide instruction on the general
concept of correction formulas as being logically
related to the criterion task (Type 1 frame under
the gap principle), it was judged irrelevant to
require the learner to know the details underlying
the concept or to apply the formulas to any data.
Thus, in addition to irrelevant instruction on the
two subtasks dealing with kinds of tests, this
treatment included a large irrelevant proportion
of instruction on scoring formulas.

* For the experimental materials, “frame” was
defined as a complete written passage which was
informational, or which required the learner to
make a response by choosing one of several
multiple-choice answers after reading the frame. The
majority of frames were 50-150 words in length.

Mastery principle. This principle refers to the 
provision for mastery of each instructional unit 
within the program by each learner. Mastery 
should increase criterion performance while non-
mastery should decrease performance. The mastery
principle was studied by removing frames asso- 
ciated with alternate amounts of practice as well 
as all remedial branches. This treatment was 
analogous to a sequential text in that the learner 
was not required to demonstrate mastery of each 
concept before proceeding in the program. 


Subjects 


The Ss were eighth-grade students. This
particular grade and age group was selected because
they had had experience in taking standardized
tests as part of the regular school curriculum, but
had had no formal instruction in the subject matter
presented in the programmed materials used.
Since the materials were written at approximately
the fifth-grade reading level, data were excluded
for those Ss whose reading performance was below
the fifth-grade reading equivalent as measured by
the Paragraph Meaning Test of the Stanford
Achievement Test, Form JM. This test, with the
remainder of the Stanford Achievement Battery,
was administered as part of the school’s regular
testing program 3 months prior to data collection
for the study. Thus, the sample was homogeneous
to the extent that all Ss were at the same
level, of an age range typical to the grade, and
evidenced reading performance at the fifth-grade
level or higher.

A total of 184 students served in the study,
which resulted in 23 Ss under each of the eight
treatment conditions prescribed by the design.
The number of Ss was determined by a technique
described by Federer (1955, pp. 73-76) which was
used to compute the approximate replications
necessary to assure adequate statistical power.
Essentially the technique allows E to estimate the
number of Ss required to achieve a desired level
of probability for avoiding a Type II error. Power
was computed for .95, which indicated a minimum
of 20 replicates necessary per treatment condition.




Criterion Instruments 


TABLE 1
PENALTY AND TIME CONDITIONS OF THE FOUR
CRITERION MEASURES AND DESIRABLE
GUESSING STRATEGY

        Penalty         Power    Speed    High        Low
Test    for guessing    test     test     guessing    guessing
A       Yes             Yes      No       No          Yes
B       No              Yes      No       Yes         No
C       No              No       Yes      Yes         No
D       Yes             No       Yes      No          Yes


The criterion of interest was the learner's number
of guesses under each of four kinds of test
conditions: (a) time limit with no penalty for
guessing, (b) time limit with penalty for guessing,
(c) no time limit with penalty for guessing,
and (d) no time limit with no penalty for guessing.
In order to measure these criteria, four
general-information tests of 30 questions each were
constructed. Ten of the questions on each test were 
of moderate to low difficulty for the sample group. 
Five questions were of such high difficulty that it 
was extremely unlikely that Ss could respond ex- 
cept by guessing. Fifteen questions were nonsense 
questions which had face validity, but any response 
to them was inferred to be a guessing response. An 
example of a nonsense question was:


Alaphite mining is an important industry in
a. Brazil   b. Panama   c. Chile


The four criterion measures were randomly 
ordered which resulted in the following order of 
administration:

Test A—no time limit, penalty for guessing. 

Test B—no time limit, no penalty for guessing. 

Test C—time limit, no penalty for guessing. 

Test D—time limit, penalty for guessing. 

The content of each criterion test was similar with 
only the directions regarding penalty and time 
contingencies varied. Time limits were assigned 
as a result of pilot studies. The data for the study 
consisted, therefore, of the guessing scores on the 
four criterion tests for each S. Guessing scores
were tabulated by counting the number of high- 
difficulty and nonsense questions responded to on 
each test. This resulted in four guessing scores 
for each S. Since there were 5 high-difficulty and
15 nonsense questions on each test, the highest 
guessing score possible was 20. The remaining 10 
questions of low to moderate difficulty were not 
tabulated. Table 1 reviews the penalty and time 
characteristics for each test in addition to indicat- 
ing what the optimal guessing strategy would be 
under each condition.

It may be observed from Table 1 that optimal 
guessing scores would be little or no guessing on 
Tests A and D because guessing was penalized, 
and high or total guessing scores on Tests B and C 
because of no penalty for guessing. 

TABLE 2

DESCRIPTIVE STATISTICS FOR CRITERION TESTS A, B, C, D, AND TOTAL
BY TREATMENT CELL FOR EACH OF THE FIVE ANALYSES

                       Myes                               Mno
              Ino             Iyes              Ino             Iyes
Test       Mean     SD     Mean     SD      Mean     SD     Mean     SD
Gno
  A         5.17    5.91    ...     6.05     9.04    7.66   10.87    6.77
  B        19.00    3.86    ...     1.97    17.34    5.55   17.39    5.51
  C        17.91    4.06    ...     4.37    16.70    4.59   17.57    3.46
  D         5.83    6.65    ...     6.75     8.57    7.19   10.44    7.62
  Total    65.48   13.30    ...    12.40    56.43   17.68   53.49   14.04
Gyes
  A         9.30    7.90    ...     6.16    13.26    6.63   10.70    7.24
  B        14.78    6.71    ...     5.89    17.04    4.32   17.70    4.62
  C        12.87    7.37    ...     6.16    14.30    5.63   15.78    5.56
  D         6.83    6.75    ...     7.89    10.26    6.54     8.17    6.70
  Total    51.52   14.82   47.43   15.11    48.26   10.66   52.00   13.14

Note. A low guessing score was desirable on Tests A and D because of the penalty for guessing
condition, while a high score was desirable on Tests B and C because of no penalty for guessing. A high
total score was desirable. Abbreviated: M = mastery, I = irrelevancies, G = gap.


Since the criterion of interest was the number
of guesses under specific test conditions on the
four tests, it was possible to determine an overall
total “appropriateness” score for each individual.
This score was computed by giving a point for
each nonsense and high-difficulty question not
answered on Tests A and D since there was a
penalty for guessing, while on Tests B and C a
point was given for each nonsense and high-diffi-
culty question answered because there was no
penalty for guessing. Therefore, the higher an in-
dividual’s appropriateness score, the better was his
overall guessing performance for all four tests com-
bined.
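
The scoring rule for the total appropriateness score can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the investigator's own procedure: the names `PENALIZED`, `MAX_GUESSES`, and `appropriateness_score` are hypothetical, and the input is assumed to be one S's four guessing scores (0 to 20) keyed by test letter.

```python
# Hypothetical sketch of the appropriateness score described in the text.
# Tests A and D penalized guessing; Tests B and C did not.
PENALIZED = {"A": True, "B": False, "C": False, "D": True}
MAX_GUESSES = 20  # 5 high-difficulty + 15 nonsense questions per test


def appropriateness_score(guessing_scores):
    """Return the total appropriateness score (0 to 80) for one S.

    guessing_scores maps each test letter to the number of
    high-difficulty and nonsense questions the S answered.
    """
    total = 0
    for test, penalized in PENALIZED.items():
        guesses = guessing_scores[test]
        if not 0 <= guesses <= MAX_GUESSES:
            raise ValueError(f"guessing score out of range on Test {test}")
        # One point per question NOT answered when guessing is penalized,
        # one point per question answered when it is not.
        total += (MAX_GUESSES - guesses) if penalized else guesses
    return total


best = appropriateness_score({"A": 0, "B": 20, "C": 20, "D": 0})
worst = appropriateness_score({"A": 20, "B": 0, "C": 0, "D": 20})
mixed = appropriateness_score({"A": 5, "B": 19, "C": 18, "D": 6})
```

Under this reading the score runs from 0 (guessing only where it is penalized) to 80 (guessing only where it is free), which is why a high total was desirable.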


Procedure 


Prior to the collection of data, E consulted with
the participating teachers. They were instructed in 
their procedural responsibilities in administering 
the materials and in answering questions relevant 
to the nature of the study. 

Data were collected the following week. The ex- 
perimental programs were arranged in sequential 
sets of eight, that is, each set had one booklet for 
each treatment. When the materials were admin- 
istered, the teacher distributed the booklets by 
starting with the row of students on his right, giv- 
ing out the booklets by sets from right to left 
around the room. Each treatment condition was 
present in every room. Time was allotted for all 
Ss to complete their material. The criterion in- 
struments were administered 2 days later. Standard 
directions were used by the teachers in administer- 
ing both the experimental programs and the cri- 
terion tests. 
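
The distribution of booklets in sequential sets of eight can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the treatment labels are borrowed from the abbreviations used in Table 2, the order of booklets within a set is assumed, and the function name `distribute_booklets` is hypothetical. Classroom seating and the exclusion of low readers are not modeled.

```python
from collections import Counter
from itertools import cycle, islice

# The eight treatment conditions of the 2 x 2 x 2 design (assumed labels).
TREATMENTS = [
    (mastery, irrelevancies, gap)
    for mastery in ("Myes", "Mno")
    for irrelevancies in ("Ino", "Iyes")
    for gap in ("Gno", "Gyes")
]


def distribute_booklets(n_students):
    """Hand out booklets set by set, one treatment per student in rotation."""
    return list(islice(cycle(TREATMENTS), n_students))


assignments = distribute_booklets(184)
cell_sizes = Counter(assignments)
```

With 184 students and complete rotations, every condition receives the same number of Ss, which matches the 23 per cell reported above.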


Results


Table 2 presents the descriptive sta-
tistics by treatment cell for each of the
five 2 × 2 × 2 analyses of variance. The
critical F value for all tests at the .05 level
of significance was F = 3.90, df = 1/176.

The F value associated with the gap main
effect was statistically significant in four of
the five analyses (for Test A, F = 6.75; for
B, F = 4.94; for C, F = 18.17; and for
Total, F = 15.80). Neither the irrele-
vancies effect nor the mastery effect was
significant in any of the five analyses. Two
of the interactions proved significant be-
yond the .05 level, Irrelevancies × Mas-
tery (for Test A, F = 4.85) and Gap ×
Mastery (for Test B, F = 4.94). The na-
ture of the Irrelevancies × Mastery inter-
action may be observed in Table 3 by
inspecting the means of the cells associated
with the interaction. The analysis of
variance for the simple effects of the two
levels of mastery for each level of ir-
relevancies yielded an F of 8.14 (p < .05).

Table 4 shows the means associated with
the Gap × Mastery interaction. The analysis for
simple effects of the two levels of mastery
for each level of gap resulted in an F of
9.88 (p < .05).


TABLE 3

IRRELEVANCIES × MASTERY CELL MEANS FOR
TEST A ANALYSIS

               Irrelevancies
Mastery        No        Yes
Yes            7.24     11.39
No            11.15     10.78


TABLE 4

GAP × MASTERY CELL MEANS FOR
TEST B ANALYSIS

                   Gap
Mastery        No        Yes
Yes           19.17     15.83
No            17.37     17.37
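
Because every cell contained the same number of Ss, the interaction means in Table 3 should equal the Table 2 cell means averaged over the two gap levels. The following Python sketch checks this for the Test A means that are legible in Table 2; the assignment of Table 2 columns to cells is an assumption made here, the dictionary and function names are hypothetical, and the mastery-yes, irrelevancies-yes cell is omitted because its means are not available in the table as reproduced.

```python
# Test A cell means read from Table 2, keyed (mastery, irrelevancies, gap).
# The column-to-cell mapping is an assumption of this sketch.
TEST_A_MEANS = {
    ("yes", "no", "no"): 5.17,
    ("yes", "no", "yes"): 9.30,
    ("no", "no", "no"): 9.04,
    ("no", "no", "yes"): 13.26,
    ("no", "yes", "no"): 10.87,
    ("no", "yes", "yes"): 10.70,
}

# Irrelevancies x Mastery means reported in Table 3, keyed (mastery, irrelevancies).
TABLE_3_MEANS = {
    ("yes", "no"): 7.24,
    ("no", "no"): 11.15,
    ("no", "yes"): 10.78,
}


def collapse_over_gap(cell_means, mastery, irrelevancies):
    """Average the gap-no and gap-yes cells (valid only for equal cell sizes)."""
    cells = [cell_means[(mastery, irrelevancies, gap)] for gap in ("no", "yes")]
    return sum(cells) / len(cells)


differences = {
    key: abs(collapse_over_gap(TEST_A_MEANS, *key) - reported)
    for key, reported in TABLE_3_MEANS.items()
}
```

Each difference stays within rounding error of the two-decimal values, which supports reading the Table 3 entries as unweighted averages over gap.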


Discussion 


The most predictable of the findings was 

the effect produced by introducing a gap 
into the instructional material. The most 
interesting results, however, are those as- 
sociated with the interaction effects of the 
mastery and irrelevancies principles. When 
material irrelevant to the objectives was 
introduced and mastery of it was required, 
Ss employed significantly poorer guessing 
strategies. On the other hand, irrelevant 
material, when presented, but not required 
to be mastered, did not produce a detri- 
mental effect on performance. The follow- 
ing tentative conclusions appear appro- 
priate: 
1. If the task to be learned can be 
simulated, instructional materials should 
include tasks identical to or highly similar 
to the criterion task as part of the pro- 
gram. Although learning appears to result 
from instruction which does not include 
simulation of the criterion task, it occurs to 
a lesser degree (gap principle).

2. If information is included in the in-
structional material which a task analysis
indicates is not essential to reaching the
terminal objectives, mastery of the non-
essential material should not be required.
However, if irrelevant material with face
validity is introduced and is not required
to be mastered, there is no apparent
decremental effect on learning (Irrele-
vancies × Mastery interaction).

3. If gaps are not introduced into the
instructional material, mastery of the ma- 
terial should be required. However, if gaps 
are introduced, mastery of the remaining 
material does not appear to benefit instruc- 
tion (Gap X Mastery interaction). 

4. The principle of mastery of instruc-
tional material should be used selectively
to require mastery only of that instruction 
which a task analysis indicates is essential 
to obtaining the terminal objectives. 


ATTITUDINAL AND INTELLECTUAL CORRELATES OF
ATTENTION: A STUDY OF FOUR
SIXTH-GRADE CLASSROOMS¹


HENRIETTE M. LAHADERNE²
University of Chicago


Data collected from 4 6th-grade classrooms (N = 125) were exam-
ined to determine whether children’s attentiveness in class was related
to their attitudes toward school on the one hand, and to achievement
and ability on the other. Each pupil’s attention to the main class ac-
tivity was recorded over a 2-month period, questionnaires assessing
the attitudes were administered, and IQ and achievement-test scores
were obtained from school records. There was practically no relation
between students’ attitudes and measures of attention; however, a
positive relationship was found between measures of students’ atten-
tion and scores on achievement and intelligence tests. In sum, all of
the pupils in a classroom may have been subjected to the pressures
for attention but the extent to which they responded appears tied to a
general ability variable rather than to an attitudinal one.


Teachers gauge the success of their 
teaching not so much by the scores their 
pupils attain on achievement tests as by 
the involvement pupils demonstrate during 
ongoing class activities (Jackson & Bel- 
ford, 1965). They assume that if a child 
is engrossed in an activity, he is getting 
something out of it even if that “some- 
thing” is not identifiable or measurable. 
This way of looking at things may seem 
at odds with recommended procedures of 
evaluation but it makes sense when con- 
sidered in the social context of the class-
room.

In social gatherings, the individual com-
municates his esteem and attachment for
the other members as well as for the
situation itself by giving or withholding
his attention to the activity at hand
(Goffman, 1963). From this viewpoint, it
is not surprising that the teacher, as a
leader of a social gathering and responsi-
ble for engaging the pupils, should be alert
to the behavioral cues of involvement.
Moreover, the flow of classroom life can-
not wait for a delayed measurement.
Small wonder, then, that standard test
scores offer less immediate feedback to the
teacher than do the flickering signs of at-
tention.


¹ Expanded version of a paper presented at the
American Educational Research Association Con-
vention, New York, New York, February, 1967.
The research reported herein was performed pur-
suant to a contract with the United States Depart-
ment of Health, Education, and Welfare, Office
of Education, under the provisions of the Co-
operative Research Program. The author is grate-
ful to Philip W. Jackson for his aid and encourage-
ment.

² Now at the Institute for the Development of
Educational Activities, Inc.

The concern of the present study is 
whether these fleeting cues are related to 
more enduring characteristics of students. 
Is it possible, for example, that attention 
to specific tasks presages both a positive 
orientation toward school and academic 
gains? Perhaps the behaviors that serve 
as cues of attention are also conducive to 
the development of satisfaction with school 
and academic performance. Partial sup-
port for such a possibility comes from a
study of classroom behavior in an Air
Force school. Among the findings was a
correlation of −.58 between achievement
and student behavior indicating inatten- 
tion (Morsh, 1956). Pursuing this line of 
inquiry, the present study examines 
whether attention is related to attitude 
toward school and the teacher on the one 
hand, and to academic achievement and 
ability, on the other. 


Method


The classroom behavior of pupils was observed
over a 3-month period, questionnaires were admin-
istered to the pupils, and such background informa-
tion as IQ and achievement-test scores was ob-
tained from school records.


Subjects


The Ss were 125 pupils (62 boys and 63 girls) 
enrolled in four sixth-grade classrooms located in 
a predominantly white, working-class suburb. The 
pupils’ mean IQ as measured by the Kuhlmann- 
Anderson Intelligence Test was 104 for the boys 
and 110 for the girls. The standard deviations were 
14.5 and 13.4 for boys and girls, respectively. Two
of the classes, containing 33 and 34 pupils and 
taught by men, were in one school; the other two, 
each containing 29 pupils and taught by women, 
were in another school. 


Observations 


The visits, which ranged from ½ hour to a full
day, began in late September and continued 
through November. During each visit periodic 
tallies of pupil attention were made along with 
other observations which are not relevant to this 
paper. As far as was possible the observations were 
distributed over the entire school week and they 
sampled most of the activities in each room. The 
total hours of observation was 37, or approximately 
9 hours in each of the four classrooms. 

Using a modified version of the Jackson-Hudgins 
Observation Schedule (Jackson & Hudgins, 1965), 
the observer looked at each pupil in turn and im-
mediately recorded the state of his attention.³
Four classifications were possible: 

1. “+” if the pupil was attentive. The pupil had 
to be attending to the area of focus, namely, the 
subject to which the teacher had called attention, 
for example, arithmetic, social studies, or art. The 
pupil also had to be attending to the prescribed 
activity, that is, the activity designated by the 
teacher, such as writing in an arithmetic workbook 
or reading in a social studies textbook. 

2. “−” if the pupil was clearly inattentive. The
pupil was marked inattentive if he were not at- 
tending to the area of focus and/or the prescribed 
activity. This classification included instances of 
horseplay, reading a book when writing had been 
prescribed, and doodling when attention should be 
focused on the blackboard. 

3. “?” if it was uncertain to the observer whether
or not the pupil was attentive. 

4. “0” if the pupil’s attention was not ob- 
servable. 

Interobserver reliability, defined as percentages
of agreement, ranged from 83% to 100% in trial
observations. Additional evidence of the reliability
of the method has been reported by other investigators.
For example, interobserver reliability, defined as
percentages of agreement, ranged from 85% to
100%, with a median of 90% for a series of ob-
servations (Hudgins, 1967).
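
Percentage of agreement between two observers can be computed as the share of pupils given the same attention code in a single sweep of the room. The Python sketch below shows one common way to do this; it is not taken from the observation schedule itself, the function name `percent_agreement` is hypothetical, and the two code sequences are invented solely for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of pupils to whom two observers gave the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of pupils")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Invented codes for ten pupils: "+" attentive, "-" inattentive,
# "?" uncertain, "0" not observable.
observer_1 = ["+", "+", "-", "?", "+", "0", "-", "+", "+", "-"]
observer_2 = ["+", "+", "-", "+", "+", "0", "-", "+", "+", "-"]

agreement = percent_agreement(observer_1, observer_2)
```

In this invented sweep the observers disagree on one pupil out of ten, so agreement is 90%.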


³ A description of the conventions set up for
recording attention is included in the author's final
report, “Adaptation to School Settings: A Study
of Children’s Attitudes and Classroom Behavior,”
Washington, D.C.: Educational Research Informa-
tion Center, 1967.


Questionnaires 

Student Opinion Poll II. The children’s at-
titudes toward school were measured by the Stu-
dent Opinion Poll II. This is a 47-item multiple-
choice test derived from an original 60-item test
(Jackson & Getzels, 1959). The questions concern
four aspects of school life, namely, the curriculum,
the teacher, the peers, and the school. The follow-
ing are sample items:


6. The things I am asked to study are of:
   a. great interest to me
   b. average interest to me
   c. little interest to me
   d. no interest to me

47. In general, my feelings toward school are:
   a. very favorable: I like it as it is
   b. somewhat favorable: I would like a few
      changes
   c. somewhat unfavorable: I would like
      many changes
   d. very unfavorable: I frequently feel that
      school is pretty much a waste of time.


The test was scored by giving one point each 
time the student chose, from a set of multiple- 
choices, the response indicating the highest degree 
of satisfaction with that aspect of school life 
under question. Thus the possible range of scores 
was from 0 to 47. The mean scores were 28 for 
the boys and 31 for the girls. The standard devia- 
tions were 8.9 and 7.2 for boys and girls, respec- 
tively. The test was readministered to 63 pupils 
after a 5-month interval; the rank correlation 
coefficient between the pupils’ two scores was .66. 
The coefficient of reliability, based on the Kuder-
Richardson formula 20, was .89 for the boys, and
.85 for the girls. In an earlier study, involving 293
sixth graders, the test reliability was .86.
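
Since each Student Opinion Poll II item is scored 0 or 1, the Kuder-Richardson formula 20 applies directly. The Python sketch below implements the standard formula, KR-20 = [k / (k − 1)] × [1 − Σpq / σ²], using the population variance of total scores; whether the original computation used the population or the sample variance is not stated, so that choice is an assumption, as are the function name `kr20` and the tiny invented data set.

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    item_scores is a list with one row per student; each row lists that
    student's 0/1 score on every item.
    """
    n_students = len(item_scores)
    n_items = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_students
    # Population variance of total scores (an assumption of this sketch).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_students
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Invented responses: four students, three items.
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(example)
```

The invented data are far smaller than the 47-item test and serve only to make the arithmetic easy to follow by hand.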

The Michigan Student Questionnaire (abbrevi- 
ated version). An abbreviated version of the 
Michigan Student Questionnaire (Flanders, 1965) 
assessed the students’ attitude toward their present 
teacher and schoolwork. The form used in this 
study contained 37 descriptive statements, each 
followed by four possible replies: strongly disagree, 
disagree, agree, and strongly agree. A student’s
response to each item was scored 1, 2, 3, or 4
depending on the degree to which it revealed a
positive attitude toward his teacher. Hence, the
possible range of scores was from 37 to 148. The
mean scores for the sample of sixth graders were
110 for the boys and 114 for the girls. The standard
deviations were 14.3 and 11.2 for boys and girls,
respectively. Test reliability based on a variation
of the Kuder-Richardson formula appropriate for
weighted scores (Ferguson, 1951) was .94 in a study
involving 293 sixth graders. The following are
sample items:


16. This teacher certainly knows how to teach.
    Strongly disagree    Disagree    Agree    Strongly agree

23. I really like this class.
    Strongly disagree    Disagree    Agree    Strongly agree
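
The scoring of the abbreviated Michigan Student Questionnaire can be sketched as follows. The sketch assumes responses are coded 1 (strongly disagree) through 4 (strongly agree); the source says only that each item was scored according to the degree of positive attitude it revealed, so the `reverse_keyed` parameter for negatively worded statements is a hypothetical device, as is the function name.

```python
N_ITEMS = 37  # statements in the abbreviated form


def score_questionnaire(responses, reverse_keyed=frozenset()):
    """Sum 37 item scores so that higher totals mean a more positive attitude.

    responses: list of 37 integers, 1 = strongly disagree ... 4 = strongly agree.
    reverse_keyed: zero-based positions of items where agreement signals a
    negative attitude (hypothetical; the source does not list such items).
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected one response per statement")
    total = 0
    for position, response in enumerate(responses):
        if response not in (1, 2, 3, 4):
            raise ValueError("responses must be coded 1 to 4")
        total += (5 - response) if position in reverse_keyed else response
    return total


lowest = score_questionnaire([1] * N_ITEMS)
highest = score_questionnaire([4] * N_ITEMS)
```

The two extreme response patterns reproduce the possible range of scores, 37 to 148.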


Achievement and IQ


The achievement test scores were derived from: 
(a) the Scott-Foresman Basic Reading Test to 
accompany The New People and Progress; and (b) 
the Stanford Achievement Test (Intermediate II, 
complete battery). The intelligence quotient was 
taken from the Kuhlmann-Anderson Intelligence 
Test. 


TABLE 1

CORRELATION BETWEEN ATTENTION AND
STUDENTS’ ATTITUDES

                   Student Opinion        Michigan Student
                       Poll II              Questionnaire
Attention          Boysᵃ     Girlsᵇ       Boysᶜ     Girlsᵈ
Attentive           .12      -.13          .02      -.09
Inattentive        -.07       .10          .00       .03
Uncertain          -.08       .10         -.02      [illegible]
Nonobservable      -.16       .19         -.09       .22

ᵃ N = 62.
ᵇ N = 63.
ᶜ N = 61.
ᵈ N = 68.

Results


The most noteworthy finding in Table 1
is an overall lack of relation between stu-
dent attitudes and attention. For neither
boys nor girls were feelings toward the
school and the teacher related to their at-
tention to the dominant class activity.

The correlations in Table 2 between at- 
tention and measures of achievement sup- 
port what seems self-evident—the pupil 
who paid attention gained the most from 
his instruction. Or, conversely, the data 
might be said to show that the pupil who 
was inattentive was not apt to achieve 
academically. This finding confirms that of 
the study cited earlier in which Air Force 
students’ inattention was negatively as-
sociated with achievement (Morsh, 1956).

Table 3 shows a relation between at-
tention and IQ.⁴ The brighter the pupil,
the more likely he was to be attentive in
class. This raises the obvious question of
whether attention makes a unique contri-
bution to achievement or whether its ef-
fect is due solely to its linkage to IQ. A


⁴ A check was made to ascertain whether the
measure of attention had been contaminated by
observer bias. Conceivably, the observer grad-
ually might have acquired knowledge of each
pupil’s ability and eventually she might have
judged each pupil on the basis of his ability rather
than his behavior. If such a bias existed, the cor-
relation between attention and IQ would have
increased systematically with each successive
period of observation. This possibility was tested
by plotting IQ scores against attention measures
during the first third of the observations, the sec-
ond, and, finally, the last third of the observations.
No systematic bias was apparent.


TABLE 2

CORRELATIONS BETWEEN ATTENTION AND MEASURES OF ACHIEVEMENT

                                           Achievement
Attention
Attentive         .51**   .49**   .46**   .3?     .53**   ?       ?       .37**
Inattentive      -.47**  -.53**  -.4?**  -.44**  -.5?**  -.??**  -.47**  -.38**
Uncertain        -.28*   -.33**  -.37**  -.24    -.36**  -.37**  -.34**  -.31**
Nonobservable    -.23     .07    -.08     .05    -.06     ?      -.03     ?

ᵃ N = 61.
ᵇ N = 63.
ᶜ N = 56.
ᵈ N = 55.
* p < .05.
** p < .01.




regression analysis focusing on the partial
contribution of each variable to achieve-
ment was performed and revealed some
evidence of the singular effect of attention
on achievement. For boys the partial cor-
relation coefficient between achievement
and attention, with IQ held constant, was
.31 (p < .05) with the Scott-Foresman
Reading Test, and .26 (p < .05) with the
Stanford Arithmetic Achievement Test.
For girls, however, a statistically signifi-
cant result was obtained with only the
Scott-Foresman Reading Test. The partial
correlation coefficient between that test
and attention, with IQ held constant, was
.26 (p < .05). Apparently, attention makes
a difference with respect to certain types
of achievement but not others. More im-
portant is the question of whether it is
proper to search for the effect of attention
independent of IQ. Maybe the ability to
attend is an integral part of intelligent
performance and contributes as much to a
child’s performance on an IQ test as to
his achievement in school.
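
A correlation between attention and achievement "with IQ held constant" is a first-order partial correlation, which can be obtained from the three pairwise correlations. The Python sketch below uses the standard formula; the source does not show how its coefficients were computed, the function name is hypothetical, and the input values are invented for illustration rather than taken from the tables.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant.

    r_xy: attention with achievement
    r_xz: attention with IQ
    r_yz: achievement with IQ
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("partial correlation undefined when |r| = 1 with z")
    return (r_xy - r_xz * r_yz) / denominator


# Invented values, chosen only to show the effect of partialling out IQ.
r_partial = partial_correlation(r_xy=0.50, r_xz=0.40, r_yz=0.60)
```

With these invented inputs the correlation drops from .50 to roughly .35 once IQ is held constant, the same kind of shrinkage the text describes.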

Finally, Table 4 shows low correlations
between students’ attitudes and their
achievement-test scores and IQ. The lack
of a connection between attitude toward
school and scholastic performance con-
firms the findings of prior studies (Jack-
son & Lahaderne, 1967). Despite the re-
peated absence of ties between the attitude
scores and the measures of scholastic per-
formance and pupil attention, the Student
Opinion Poll II and the Michigan Student
Questionnaire correlate significantly with
each other (r = .63; p < .001) and with


TABLE 3

CORRELATIONS BETWEEN ATTENTION AND IQ

                        IQ
Attention         Boysᵃ     Girlsᵇ
Attentive          .4?       .44**
Inattentive        ?        -.46**
Uncertain         -.49**    -.33**
Nonobservable     -.20       .07

ᵃ N = 61.
ᵇ N = 63.
* p < .05.
** p < .01.

TABLE 4

CORRELATION BETWEEN STUDENTS’ ATTITUDES
AND MEASURES OF SCHOLASTIC PERFORMANCE

                           Student Opinion       Michigan Student
                               Poll II             Questionnaire
Measures                   Boys      Girls       Boys      Girls
Scott-Foresman Readingᵃ    ?          .05         .01      -.01
Stanford-Readingᵇ          .16       -.10         .08      -.12
Stanford-Arithmeticᵇ       .16        .08         .01       .02
Stanford-Languageᵇ         .07       -.08        -.05      -.07
IQᵃ                        .15        .10         .08      -.06

ᵃ N = 61 boys and 63 girls.
ᵇ N = 56 boys and 55 girls.


measures of other variables. The Student
Opinion Poll II, for example, correlates
significantly with the Children’s Intellec-
tual Achievement Responsibility Ques-
tionnaire (N = 292; r = .41; p < .001),
the Children’s Social Desirability Ques-
tionnaire (N = 125; r = .88; p < .001),
and teachers’ ratings of their pupils’ at-
titude toward school (N = 292; r = .85;
p < .001).


Discussion 


The real problem posed by the results 
of this study concerns the lack of a con- 
nection between the way students felt 
about school and their attentiveness in 
class. Why, for example, did students who 
were dissatisfied with school appear to be 
just as attentive as those who were satis- 
fied? What has happened to the popular 
stereotype of the daydreaming malcon- 
tent? 
Perhaps the constraints imposed on 
pupils to be attentive were so strong that
attitudes could not influence behavior. 
Consider, for example, the following re- 
strictions. Pupils could not leave the class- 
room, or for that matter, get up from their 
desks without permission. They could not 
chatter with their neighbors. They had to 
be recognized before speaking up in class. 
Their actions at any given moment had to 
be within the sphere prescribed by the 
teacher. Moreover, one of the teacher’s 
major functions was to preserve the class- 
room order. She called on the reluctant, 
snapped the daydreamer back to atten- 
tion, reprimanded the cutup, and often 
reminded the pupils of the designated focus
of attention. In short, pupils were coaxed 
and compelled to adhere to a code of con- 
duct that supported the order of the class- 
room. Thus, regardless of how he felt 
about school, the disgruntled pupil had 
little chance to do anything about it in 
the classroom. 

It is evident that the forces for atten- 
tion impinged on everyone. Less apparent 
are the variables that accounted for 
fluctuations of attention. The data in 
Table 3 suggested the possibility that 
ability to attend may be an integral part 
of intelligent behavior. If this were in- 
deed the case, the less able pupils may 
have been limited in their capacity to at- 
tend just as they were in their capacity to 
achieve academically. Furthermore, the 
usual classroom situation where the teacher 
directed the curriculum to what he con- 
sidered was the class average may have 
strengthened the connection between in- 
telligence and attention. The able may 
have understood and participated in the 
instructional matter but the less able 
could not keep up. This possibility implies 
that a curvilinear relation may exist be-
tween attention and level of instruction.

As the level of instruction increases in 
difficulty from zero, attention may also 
increase to an optimal point and then de- 
crease beyond that point. Moreover, it 
seems likely that the optimal point may 
vary with ability. The apex for the high- 
ability pupils may be at a higher point 
than for low-ability pupils. According to 
this speculation, the less able pupils in 
the present study may have been inat-
tentive when the level of instruction was
beyond their optimal point, and the
brighter pupils, when the level of instruc-
tion was below their optimal point. In sum,
all of the pupils in a classroom may have 
been subjected to the pressures for at- 
tention but the extent to which they re- 
sponded appears to have been tied to 
general ability and instructional variables 
rather than to the pupils’ attitude toward 
school. 


SEMANTIC SPACE AS AN INDICATOR OF SOCIALIZATION


KEITH A. McNEIL¹
The University of Texas, Austin


A semantic differential was administered to 521 6th-grade Ss who
could be classified into 4 subcultural groups on a hypothe-
sized degree of socialization. A principal components factor analysis
yielded 6 factors having an eigenvalue above 1. Factor scores were
computed for each S and subjected to a discriminant analysis. The
discriminant function which accounted for the most variance pro-
duced the same rank ordering of the subcultures as the hypothesized
degree of socialization. The degree of socialization of an individual
can thus be estimated from the empirical weights obtained in the
discriminant analysis.


Several investigators have claimed that
the structure of semantic space is similar
across widely different cultures (Osgood,
Suci, & Tannenbaum, 1957; Tanaka,
Oyama, & Osgood, 1963). Several prob-
lems, including translation equivalence
(Brown, 1958) and developing an adequate
statistical technique with which to match
factors, suggest caution in accepting the
above conclusions. Also, only gross struc-
tural similarities have been reported, and it
is quite feasible that subtle semantic dif-
ferences may be very important. For ex-
ample, communication is sometimes hin-
dered when the parties involved do not
attach the exact same meaning to the words
being used in the communication act.

The well-documented subcultural differ-
ences in linguistic performance, as well as
in other behavioral realms, suggest that
there may indeed be differences in sub-
cultural meaning. Subcultural differences,
if not apparent in the basic
structure of semantic meaning, should be
apparent at least in the magnitude on each
semantic dimension. Evidence bearing on
the structural differences of subcultural
meaning has been reported elsewhere (Mc-
Neil, 1967), whereas evidence pertaining to
subcultural differences in magnitude of
semantic meaning is presented here.

Occurring with an increased level of
socialization may be a change in one’s view
of the environment. This change should be
reflected in one’s connotative meaning.
Thus, looking at one’s connotative meaning
may provide information as to the level
of socialization. The present investigation
is an attempt to predict level of socializa-
tion through measurement of connotative
meaning. In particular, it was expected
that the more socialized a subculture, the
more closely that subculture’s profile of
semantic meaning would resemble the core
culture’s profile of semantic meaning.


¹ Now in the Department of Guidance and Edu-
cational Psychology at Southern Illinois Univer-
sity.


Method


Sample 


The four subcultural groups in this study com- 
municate and compete in the same culture. The 
extent to which the various subcultures have suc- 
ceeded in “attaining the core culture,” that is, in 
being socialized, was hypothesized as follows, from 
most to least: middle class whites (MCW); lower 
class whites (LCW); lower class Negroes (LCN);
lower class Latin Americans (LCLA). The LCLA 
were hypothesized to be at the bottom of the 
socialization continuum because of the foreign 
language influence in their homes and also because 
of two cultural aspects. “Mañana” is a fatalistic
attitude expressed by taking care of today’s needs
and not worrying about tomorrow's needs (Saun-
ders, 1954). The LCLA’s outlook is thus different
from that of the core culture.
“Evidia” is a type of black magic wherein negative
sanctions are applied to those who achieve some 
degree of success (Madsen, 1964). To the extent 
that evidia operates in the subculture, the aspira- 
tions of the members are restricted. 


Instrumentation 


A semantic differential was administered by 
trained graduate students to 521 sixth-grade Austin 
children. The instrument contained 12 high fre- 
quency concepts (Elephant, Army, Me, Butter, 
Fear, Clouds, Doctor, Street, Baby, Schoolwork, 
Policeman, Pain), each being rated on 20 bipolar 


adjectives. The concepts and scales were selected
on the basis of previous research, mainly the work
reported by Lilly (1965). Each bipolar adjective
constituted a 7-point scale. Scale responses were


TABLE 1

VARIMAX LOADINGS OBTAINED FROM THE TOTAL
SAMPLE OF 521 SUBJECTS

Semantic differential scales      I     II    III    IV     V     VI
Strong-Weak                      74    -08     08   -16    14    -07
Good-Bad                         51    -28     20    09    42     04
Usual-Unusual                    04    -40     00    51    32    -22
Fast-Slow                        69    -07    -05    13    02     09
Interesting-Boring               60    -37     29    12    05     02
Not important-Important         -14     77     11   -01   -12    -15
Uncertain-Certain               -22     64     15   -01   -07    -27
Changing-Steady                  09    -04    -06    71   -04    -13
Small-Large                     -11     15     78    10   -17    -09
Rough-Smooth                     16     23    -32    23   -15    -55
Heavy-Light                      38     08    -16   -22    01    -61
Old-New                         -17    -04     06    11    17    -72
Moving-Still                     60    -13    -16    29   -09    -11
Simple-Complex                   20     24     17    50    06     28
Dirty-Clean                     -15     38     23    16   -40    -50
Not active-Active               -08     75     11    04    06     04
Soft-Hard                        25     14     71   -13    08     25
Near-Far                         17     06    -09    08    71    -19
Dangerous-Safe                   15     27     01    03   -60    -40
Controlled-Not controlled        12    -56     17    15    36    -07

Note. Decimal points have been omitted.


summed across the 12 concepts, and correlated to
yield a 20 × 20 correlation matrix. The resultant
correlation matrix was factored (with unities in the
diagonal) by the principal axes method, with
Varimax rotation of factors whose eigenvalues
exceeded 1.0 (Veldman, 1967). Six factors resulted
and factor scores were computed on the six fac-
tors for each S. We thus have a measure of each
S's semantic meaning on each of the important
dimensions of human meaning.

The Ss were then grouped into their respective
subcultural classifications and a multiple discrimi-
nant analysis (Veldman, 1967) was performed on
the four subcultures, considering the six factor
scores as the variables on which the discrimination
was to be made.
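
The factoring steps described above can be sketched in code. This is only a minimal illustration under stated assumptions, not the Veldman (1967) programs: the 521 × 20 array of summed scale scores is replaced by random numbers, and the function names `varimax` and `factor_scales` are hypothetical. The sketch factors the correlation matrix with unities in the diagonal, keeps factors whose eigenvalues exceed 1.0, and applies a standard SVD-based varimax rotation.

```python
import numpy as np


def varimax(loadings, max_iter=200, tol=1e-10):
    """Rotate a loading matrix orthogonally toward the varimax criterion."""
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Target matrix used by the usual SVD-based varimax update.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() - previous < tol:
            break
        previous = s.sum()
    return loadings @ rotation


def factor_scales(scores):
    """Factor the correlation matrix (unities in the diagonal), then rotate."""
    corr = np.corrcoef(scores, rowvar=False)      # scales x scales correlations
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                          # retain eigenvalues above 1.0
    unrotated = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return unrotated, varimax(unrotated), eigvals


# Hypothetical data standing in for 521 Ss x 20 summed scale scores.
rng = np.random.default_rng(0)
latent = rng.normal(size=(521, 3))
mixing = rng.normal(size=(3, 20))
scores = latent @ mixing + rng.normal(size=(521, 20))

unrotated, rotated, eigvals = factor_scales(scores)
print(rotated.shape)
```

Because the rotation is orthogonal, each scale's communality (the row sum of squared loadings) is the same before and after rotation; only the distribution of loadings across factors changes.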


Results


Table 1 contains the factor loadings on
the six obtained semantic factors. In-
terpretation of the factors is not crucial to
the present study, so just a brief listing of
the factor names seems sufficient: Factor
I, Evaluation; Factor II, Activity; Fac-
tor III, Potency; Factor IV, Stability;
Factor V, Security; Factor VI, Unnamed.


TABLE 2
Subcultural Factor Score Means

              Middle      Lower       Lower       Lower class
              class       class       class       Latin
Factor        white       white       Negro       American        F*       p
              (N = 134)   (N = 146)   (N = 83)    (N = 158)
Evaluation      .34         .12        -.37         -.20        12.36    <.0001
Activity        .17         .09         .22         -.34         9.64    <.0001
Potency        -.01         .16        -.32          .03         4.13    <.01
Stability      -.13        -.06         .31         -.00         3.61    <.02
Security       -.29        -.01         .08          .21         6.38    <.001
Unnamed         .16         .14         .00         -.27         6.04    <.001

Note. Factor scores based on the factor analy-
sis of the total sample.
* df within Ss = 517; df between Ss = 3.


Table 2 indicates that there are sig-
nificant differences between the factor
scores of the four subcultures on each of
the six semantic factors. Consistent trends
cannot be discerned, nor does an overall
interpretation seem to be possible with
these univariate analyses.

Table 3 indicates that there are two
orthogonal ways of combining the factor
scores so as to separate significantly the
four subcultures. The first discriminant
function (Table 4) is of most interest,
though, as it accounts for the most vari-
ance. The first function produces group
centroids (group means of the discriminant
scores) which are in the same relative or-
der as the hypothesized degree of socializa-
tion. The second discriminant function
also produces a significant difference be-
tween the four subgroups, and it should be
noted that the differences between the two
nonwhite subcultures are responsible for
the definition of the two ends of the dis-
criminant function. Interestingly, the third
discriminant function (nonsignificant) was
defined by the two white subcultures.


TABLE 3

Discriminant Analysis Results: Group
Centroids on the Three Discriminant
Functions

                                 Discriminant functions
Subcultures                      1           2           3
Middle class white              .51        -.02        -.11
Lower class white               .20        -.07         .15
Lower class Negro              -.27         .56         .00
Lower class Latin American     -.48        -.21        -.04
Variance                      70.03%      26.05%       3.92%
χ²                            89.43a*     34.69b*      5.38c

a df = 8.
b df = 6.
c df = 4.
* p < .0001.


TABLE 4
Discriminant Weights for the First Function

Factor          Weight
Evaluation       .606
Activity         .429
Potency          .108
Stability       -.218
Security        -.454
Unnamed          .428

Note. Wilks’ lambda = .780; F = 7.407, df =
18/1,449, p < .0001.


Discussion 


The results of the discriminant analysis
indicate that the four subcultures can be
separated significantly when scores on
semantic factors are taken into considera-
tion. What is of more importance is the
finding that the relative order of the sub-
cultures on the discriminant function
which accounts for the most variance is
the same as the hypothesized degree of
socialization. Applying the obtained dis-
criminant weights (Table 4) to the semantic
factor scores of a new S would predict his
position on the first discriminant function
(or estimate his degree of socialization).
After the majority of the variance is
extracted from the data by the first dis-
criminant function, the underlying function
which extracts the greatest amount of vari-
ance is due to the semantic differences be-
tween the two nonwhite subcultures. After
these two sources of variance are ac-
counted for, the remaining variance is ac-
counted for by the two white subcultures.




Thus, the most important underlying di- 
mension accounting for the subcultural dif- 
ferences in semantic meaning is not due to 
ethnicity or social class, but to the degree 
of socialization. 
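
The suggestion that the Table 4 weights can be applied to the factor scores of a new S can be illustrated with a short sketch. It assumes that the discriminant score is simply the weighted sum of the six factor scores, and it uses the weights and group means as read from Tables 4 and 2; a few digits and signs are hard to read in the printed tables, so the values below should be treated as assumptions. The function name `discriminant_score` is hypothetical.

```python
# Discriminant weights for the first function, as read from Table 4
# (signs of some entries are assumptions where the print is unclear).
WEIGHTS = {
    "evaluation": 0.606, "activity": 0.429, "potency": 0.108,
    "stability": -0.218, "security": -0.454, "unnamed": 0.428,
}

# Group means of the factor scores, as read from Table 2
# (the MCW Evaluation mean of .34 is an assumed reading).
GROUP_MEANS = {
    "MCW": {"evaluation": 0.34, "activity": 0.17, "potency": -0.01,
            "stability": -0.13, "security": -0.29, "unnamed": 0.16},
    "LCW": {"evaluation": 0.12, "activity": 0.09, "potency": 0.16,
            "stability": -0.06, "security": -0.01, "unnamed": 0.14},
    "LCN": {"evaluation": -0.37, "activity": 0.22, "potency": -0.32,
            "stability": 0.31, "security": 0.08, "unnamed": 0.00},
    "LCLA": {"evaluation": -0.20, "activity": -0.34, "potency": 0.03,
             "stability": -0.00, "security": 0.21, "unnamed": -0.27},
}


def discriminant_score(factor_scores, weights=WEIGHTS):
    """Weighted sum of factor scores on the first discriminant function."""
    return sum(weights[name] * factor_scores[name] for name in weights)


# Applying the weights to each group's mean factor scores should give
# values close to the first-function centroids reported in Table 3.
for group, means in GROUP_MEANS.items():
    print(group, round(discriminant_score(means), 2))
```

Under these assumptions the four computed values fall in the same order as the hypothesized socialization continuum (MCW, LCW, LCN, LCLA) and agree with the Table 3 centroids to within rounding. A new S would be scored the same way, by passing his six factor scores to `discriminant_score`.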

The extent to which the obtained re- 
sults are a consequence of subcultural dif- 
ferences unrelated to socialization needs to 
be verified. That is, the results need to be 
cross-validated on a sample of Ss who can
be classified into groups of varying degrees 
of socialization, on the basis of a so- 
cialization variable. 

It is further suggested that the semantic 
differential technique would be a valuable 
instrument to measure socialization be- 
cause of the reduction of the effects of 
response sets. That is, the instrument does 
not appear to S to be measuring socializa- 
tion. In fact, the average S is unable to 
determine how the instrument is being 
used. Also, the instrument can yield additional
information, besides a socialization score,
which could be valuable.


SELECTION OF DEFINING PROPERTIES IN
CONCEPT ATTAINMENT


D. CECIL CLARK* AND FREDERICK J. McDONALD
Stanford University


The present study explored the question, When an S is presented
with several examples of a concept, what determines which of the
common properties he will select as defining the concept? 3 com-
peting hypotheses were tested by asking 5th graders to select a defining
property when presented with verbal examples of a concept. Results
showed that (a) the common property with the highest total associa-
tion strength will be selected as defining with significantly greater fre-
quency than other common properties, and (b) neither the order
in which the examples are presented nor the presence of an example
with a strongly associated property appears to influence the selection
of the defining property.


Many of the concepts taught in the ele- 
mentary classroom are of the type that 
have clearly differentiated properties 
which must be accurately learned by all of 
the students. In teaching for this type of 
concept, the teacher usually presents sev- 
eral examples, pointing out the defining 
properties common to each. Thus, in teach- 
ing for the concept “tree,” she may show a 
picture of a maple tree, a picture of a cot- 
tonwood tree, and a picture of a douglas 
fir tree, indicating that, among other pro- 
perties, each has a crown, a trunk, and 
roots (defining properties, see A, Table 
1). In testing to determine whether the 
class has mastered the concept, she ob- 
serves that some of the students correctly 
identify new pictures of trees and nontrees, 
while others do not. One explanation for 
this phenomenon is that students differ in 
the common properties they select to de- 
fine the concept. One student might cor- 
rectly select crown, trunk, and roots as 
defining properties while another student, 
from the same examples, might incorrectly 
select color, trunk, and size as defining 
the concept. 

Such differences give rise to a basic ques- 
tion about the process of forming concepts: 
When presented with several examples of a 
concept, what determines which of the com- 
mon properties will be selected as defining 
the concept? The answer to this question
has implications for the optimal type and
sequence of examples that can be used to
illustrate a given concept.


* Now Associate Professor, Department of Edu-
cational Psychology, College of Education, Uni-
versity of Washington.


BACKGROUND 


In a typical concept presentation the 
teacher may offer both pictorial and verbal 
examples to illustrate a given concept (A, 
Table 1). This study was limited to the 
selection of defining properties when only 
verbal examples were used (B, Table 1). 
Thus, properties such as color, size, and
shape, which are often significant with pic-
torial examples, were not considered in the
present study.

Underwood and Richardson (1956a) 
have developed a set of adult word-associa-
tion norms. When presented with a noun,
Ss were asked to respond with any sensory 
adjective that came to mind. Responses to 
each noun as well as their frequencies of 
occurrence were tabulated. For example, 
61% of the Ss responded to the noun button 
with the sensory adjective “round,” 15% 
with “small,” 5% with “hard,” and 5% 
with “white.” According to their norms, 
the same adjective was often given to dif-
ferent nouns. For instance, “round,”
“hard,” “small,” and “white” were like-
wise common adjectives for baseball and
hailstone (B, Table 1). Associative strength
represented the percentage of Ss giving a
particular adjective to a particular noun.
Thus, the associative strength between but-
ton and “round” was 61%, and between
button and “small,” 15%.


TABLE 1
Taxonomy of a Classroom-Type Concept and
an Experimental Concept

Example                            Common properties    Concept

Classroom (A)
  Picture of maple tree(a)         Crown(b)
  Picture of cottonwood tree(a)    Trunk(b)
  Picture of douglas fir tree(a)   Roots(b)             Tree
                                   Brownish
                                   Large branches

Experimental (B)
  Button                           Round
  Baseball                         Hard
  Hailstone                        Small                X
                                   White

(a) Accompanied by explanation.
(b) Arbitrarily selected as defining properties.


In the present study, the nouns button, 
baseball, and hailstone were viewed as 
three examples of some new concept, “X,” 
and “round,” “small,” “hard,” and “white” 
represented properties common to the three 
examples (B, Table 1). The basic ques-
tion, then, was what would cause one of
these common properties to be selected
over the others as defining the concept.


HYPOTHESES 


Three alternative explanations provided 
the bases for the hypotheses tested in this 
study. One explanation was that as S
thinks about the properties common to the
three examples, a combining operation oc-
curs. The three associative strengths of a
property are added together; this occurs
with the other three properties as well.
(In Table 2, for instance, “round” = 61 +
70 + 14 = 145.) The first hypothesis was
that the property with the highest total as-
sociative strength would be selected as
defining concept “X” with greater fre-
quency than the other properties. (This
study was limited to only one defining prop-
erty.) Thus, in Table 2, since “round”
has a higher total associative strength than
the other properties, it would be selected in
defining concept “X.” Results of Under-
wood and Richardson (1956a), Schulz,
Miller, and Radtke (1962), and Coleman
(1964) suggest the plausibility of this hy-
pothesis.

A second competing explanation was 
that the strength of the association be- 
tween one of the examples and one of its 
properties would be so strong that other 
properties of this example—and dissimilar 
properties of other examples—would be- 
come less obvious. The effect of this strong 
association in one example would be that 
of highlighting the same property in other 
examples. The second hypothesis was that 
the common property with the single great- 
est associative strength would be selected 
as defining with greater frequency than the 
other properties. Thus, in Table 2, the as- 
sociation between baseball and “round” 
is so strong (70) that “round” would be- 
come the vivid property of both button and 
hailstone. As a result of this spread of ef- 
fect, “round” would become the obvious 
property to define concept, “X.” Investiga- 
tions by Freedman and Mednick (1958), 
Underwood and Richardson (1956b), and 
Wicklund, Palermo, and Jenkins (1964) 
suggest the feasibility of this hypothesis. 

A third competing explanation was that
the order in which the examples appeared
would determine which properties would be
selected. The third hypothesis was that the
strongest property of the first ex-
ample S saw would be selected as defining
with greater frequency than the strongest
property of succeeding examples. Thus, in
Table 2, since S would most likely see but-
ton first and since its strongest property is
“round” (associative strength = 61), the
latter would be verified as a property of
baseball and hailstone, and then be given
as defining concept “X.” (Experimentally,
“round” would never be the strongest prop-
erty of any of the other examples.)
Studies by Cohen and Musgrove (1964),
Coleman (1963), Crouse and Duncan
(1963), and Freedman and Mednick (1958)
indicate the reasonableness of this hypoth-
esis.


TABLE 2
Explanation for the Selection of Properties
Defining Concept “X”

Example      Common property    Associative strength

Button       Round              61
             Small              15
             Hard                5
             White               5

Baseball     Round              70
             White              11
             Hard               10
             Small               5

Hailstone    Hard               49
             Round              14
             White               9
             Small               7

Note. Total associative strength for round =
145, hard = 64, small = 27, white = 25. “Round”
is selected as the property to define concept
“X” according to each of the three hypotheses.


TABLE 3
Illustrative Sets of Examples Designed to Test the Three Hypotheses

                                              Common      Associative   Total associative
Hypothesis                      Example       property    strength      strength

1. Total strongest property     Baton         Round       …             Round = 123
                                Saucer        Thin        …             Thin = …
                                Button

2. Single strongest property    Pot           Shiny       …             Shiny = …
                                Armor         Hard        …             Hard = …
                                Badge

3. First property of sequence
   Sequence A, B, C             A. Button     Round       68            Round = …
                                              Hard         3            Hard = …
                                B. Thimble    Round       12
                                              Hard        10
                                C. Stone      Hard        72
                                              Round        9

   Sequence C, B, A             C. Stone      Hard        72            Round = …
                                              Round        9            Hard = …
                                B. Thimble    Round       12
                                              Hard        10
                                A. Button     Round       68
                                              Hard         3
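
The three competing predictions can be made concrete with a short sketch that applies each rule to the illustrative strengths in Table 2. The function names are hypothetical, and the baseball/“white” value of 11 is an assumption inferred from the stated total of 25 for “white.”

```python
# Associative strengths (percentage of Ss) for the Table 2 illustration.
# The baseball/"white" entry (11) is inferred from the stated total of 25.
EXAMPLES = [
    ("button", {"round": 61, "small": 15, "hard": 5, "white": 5}),
    ("baseball", {"round": 70, "white": 11, "hard": 10, "small": 5}),
    ("hailstone", {"hard": 49, "round": 14, "white": 9, "small": 7}),
]


def total_strengths(examples):
    """Sum each property's associative strength over all examples."""
    totals = {}
    for _, properties in examples:
        for prop, strength in properties.items():
            totals[prop] = totals.get(prop, 0) + strength
    return totals


def total_strongest(examples):
    """Hypothesis 1: the property with the highest total strength."""
    totals = total_strengths(examples)
    return max(totals, key=totals.get)


def single_strongest(examples):
    """Hypothesis 2: the property with the single greatest strength."""
    pairs = [(s, p) for _, props in examples for p, s in props.items()]
    return max(pairs)[1]


def first_property(examples):
    """Hypothesis 3: the strongest property of the first example seen."""
    _, properties = examples[0]
    return max(properties, key=properties.get)


print(total_strengths(EXAMPLES))
print(total_strongest(EXAMPLES), single_strongest(EXAMPLES),
      first_property(EXAMPLES))
```

For this illustration all three rules pick “round,” which is why the experimental sets had to be constructed so that the hypotheses make different predictions. Reversing the presentation order, for example, changes only the first-property prediction.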


METHOD


One hundred and seven¹ of the 213 nouns (ex-
amples) in the Underwood and Richardson norms
were presented, one at a time, to 232 fifth-grade
children in the Palo Alto Unified School District
to determine what properties children would as-
sociate with each example as well as their rela-
tive associative strengths. Each example was
presented visually for 10-15 seconds and Ss were
asked to think of another word which best de-
scribed the word in front of them. “Make sure
the word you think of tells how something smells,
or how it tastes, or what size it is, or something
like that.” Several words were presented as ex-
amples and various appropriate and inappropriate
response words were discussed until Ss fully under-
stood their task. All properties associated with
each example as well as their frequencies of oc-
currence were tabulated. Thus, “round,” 68%, and
“small,” 17% (miscellaneous, 15%) were the two
major properties associated with the example but-
ton. Since 68% of the students responded with
“round,” it represented the stronger of the two
properties.


¹ If adult norms had been used in the present
study, combinations of these 107 examples would
have provided optimal testing of the three hypoth-
eses. It was assumed that these same examples
would generally provide optimal examples after
the children’s norms had been established.

Examples with at least two common properties
were then grouped into sets with three examples
per set to test one of the three hypotheses (Ta-
ble 3). To qualify for testing the total-strongest-
property hypothesis, a set had to be arranged as
follows: The first example had about equal as-
sociative strengths for both properties (to control
for a first-property effect), and the total strength
for the two properties differed maximally (Hy-
pothesis 1, Table 3).

To qualify for testing the single-strongest-
property hypothesis, a set had to be arranged as
follows: The first example had nearly equal as-
sociative strengths for both properties (control for
a first-property effect), and the total strengths were
nearly equal (control for total-strongest-property
effect). The second and third examples in the set
each had properties which differed in their
strengths to allow one of the examples to have a
very strong property. Thus, in Hypothesis 2, Ta-
ble 3, “hard” is a relatively strong property of
armor (31%).

To qualify for testing the first-property hy-
pothesis, a set had to be arranged as follows: The
first example had a high strength for one of its
properties and a low strength for the other; the
second example had about equal strengths for
both properties, and the last example had a high
strength for one property and a low strength for
the other (which was a reverse of the first ex-
ample; see Hypothesis 3, Table 3). If the set
were presented in the sequence A, B, C, then one
property (round) should be selected; if, however,
the set was presented in the sequence C, B, A,
the other property (hard) should be selected.
Since the single strongest property and the total
strongest property were held constant across se-
quences, possible confounding effects were con-
trolled.

Of the 39 sets which qualified for testing Hy-
pothesis 1, the 20 most … were selected.
Of the 43 sets which qualified for testing Hypothe-
sis 2, 16 of the strongest were selected, and of
the 33 for Hypothesis 3, 17 were selected and
duplicated with the exception that the first and
third examples were reversed. The total number
of sets of examples designed to test the three hy-
potheses was 53.

Each of 222 different fifth graders in the same
school district was presented with one-half the
sets designed to test each of the three hypotheses.
Each experimental S received 10 for Hypothesis
1, 8 for 2, and 17 for 3, totalling 35 sets. Eight
different orders resulted from all possible combi-
nations of half the sets for each hypothesis; these
orders were randomly assigned to the 222 Ss. Each
set (three examples) was printed on a separate
card and presented one at a time for 10-15 sec-
onds. Directions were similar to those given when
one example was presented with the exception
that, in this situation, Ss were to think of a prop-
erty describing all three examples on the card.

RESULTS


The mean number of Ss selecting a prop- 
erty from each set of examples was 91.7. 
Baton, saucer, and button was one of the 
sets designed to test the total-strongest- 
property hypothesis. Of Ss responding 
to this set, 95.7% selected the property 
“round” (predicted property, see Hypothe- 
sis 1, Table 3) and 4.3% selected the 
property “thin.” These percentages were 
placed in a 2 x 2 table as observed per- 
centages along with expected percentages 
of 50% “round” and 50% “thin,” and a chi
square was computed. This same pro-
cedure was followed for each set designed
to test Hypotheses 1 and 2. Chi squares 
for Hypothesis 3 were slightly different. 
Frequencies rather than percentages were 
used, and they occurred for sequence A, B, 
C and for sequence C, B, A, rather than as 
observed and expected frequencies. 
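
The comparison of observed with expected percentages can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact formula used (for example, whether a continuity correction was applied), so the sketch computes a simple goodness-of-fit chi square on the percentages for the baton, saucer, and button set. The function name `chi_square_gof` is hypothetical.

```python
def chi_square_gof(observed, expected):
    """Goodness-of-fit chi square: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Percentages for the baton, saucer, button set (Hypothesis 1).
observed = [95.7, 4.3]    # "round", "thin"
expected = [50.0, 50.0]   # equal selection of the two properties

chi2 = chi_square_gof(observed, expected)
CRITICAL_05_DF1 = 3.841   # tabled chi-square value for df = 1, alpha = .05

print(round(chi2, 2), chi2 > CRITICAL_05_DF1)
```

Note that entering percentages rather than frequencies treats every set as if it had 100 respondents, so the resulting value does not depend on how many Ss actually responded to a given set.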

For each set testing each hypothesis, two
questions were important: (a) Did the
greatest percentage of Ss select the pre-
dicted property? (b) Were the differences
in percentages significant? (Hypotheses 1 …
Table 3).

Hypothesis 1 predicted that, in each of
the 20 sets, the total strongest property
would be selected by the greatest per-
centage of Ss. Table 4 shows that, for 18 of
the 20 sets, the greatest percentage of Ss
selected the predicted property; in 17 of
these, the difference in the percentage
selecting each property was significant
(for 15 of the 17, p < …). Clearly,
Hypothesis 1 was supported.


TABLE 4
Summary of Results

Item

Number of sets testing hypothesis
Greatest percentage selecting predicted property
  Number of chi squares
    p < .05
    ns
Greatest percentage selecting opposite property
  Number of chi squares
    p < .05
    ns


Hypothesis 2 predicted that, in each of 
the 16 sets, the single strongest property 
would be selected by the greatest percent- 
age of Ss. Results showed that in only 7 of 
the 16 was the predicted property selected 
by the greatest percentage of Ss, and in 6 
of these the differences were not significant. 
Further, in 9 sets, the opposite property 
was selected by the greatest percentage of 
Ss. Hypothesis 2 was not supported. 

Hypothesis 3 predicted that (a) the 
strongest property of the first example 
would be selected by the greatest percent- 
age of students, and (b) when the first 
example was replaced by a different exam- 
ple whose strongest property differed from 
the first, the greatest percentage of Ss 
would select the “new” strongest property. 
Results showed that this occurred in only 
8 of the 17 situations; in 7 of the 8, the 
differences in frequencies were not signifi- 
cant. In the remaining 9 situations, little 
or no change occurred. Hypothesis 3 was
not supported. 


Discussion 


A conclusion that can be drawn from the 
results is that when one of two properties 
is to be selected as defining some concept 
for which S is given verbal examples, the 
property with the highest total associative 
strength will be selected. Neither the order 
in which the examples are presented nor the
presence of an example with a strongly as-
sociated property appears to influence the
selection of the defining property. 

Several studies (Crouse & Duncan, 
1963; Freedman & Mednick, 1958; Judson 
& Cofer, 1956) have shown that the order 
in which verbal examples appear influences 
selection performance on verbal concept 
tasks. These studies used adult Ss. The 
present study, using children as Ss, failed 
to obtain similar results. One possible ex- 
planation for this inconsistency is that age, 
in part, determines when the order of ex- 
ample presentation will affect the selection 
of defining properties. Such an explanation 
would be more convincing if the examples 
and selection tasks had been similar over 
the previous experiments. In one study, 
the concept examples were names of proc- 
esses; in the other two, they were names of 
objects. In one study, the selection task 
was that of sorting cards after being pre- 
sented with a positive example; in another, 
the task was that of discarding one of four 
words which did not go with the other 
three. In the third study, the task was iden- 
tifying examples of several different con- 
cepts presented in a serial order. Given this 
variety of selection tasks and the differing 
nature of the examples, conflicting re- 
sults about the effects of example sequence 
are probably more attributable to example 
and task differences than to age differences. 

This same explanation may also account 
for the conflicting results on the effects of a
single strong property on the selection of 
defining properties. The study by Freed- 
man and Mednick (1958) differs from the 
present study in the nature of the selection 
task and the methods of example presenta-
tion.

The present study dealt with a simple
concept task. Only three examples were
used from which to select defining proper-
ties. Furthermore, the study was designed
so that only one property could be selected
and only two were common to each exam-
ple. A next logical step would be to select
concepts more similar to those presented in
the classroom which characterize a wider
variety of examples and a greater number
of common properties.


Data from 2 separate studies indicate a significant loss in positive atti-
tudes of pupils toward their teachers and schoolwork during the
school year. In the present study of 6th-grade pupils in 30 class-
rooms it was shown that this erosion of positive attitudes is not re-
lated to pupils’ IQ, socioeconomic status, or percentage of A and B
letter grades assigned by the teacher, but is related to the “external-
ity” or “internality” of the pupils and to the teachers’ verbal class-
room behavior. Greater losses in attitudes occurred among external
than among internal pupils and among pupils whose teachers ex-
hibited a lower incidence of praise and encouragement than among
those whose teachers exhibited a higher incidence of such behaviors.


Are pupils most optimistic about school- 
work as the school year begins? Does this 
optimism erode as the school year pro- 
gresses? Are there particular patterns of 
teacher behavior which appear when there 
is less erosion of optimistic pupil at- 
titudes? This article will attempt to an- 
swer these questions, not with complete, 
unequivocal answers, but with some sug-
gestions based on two separate studies.

In a 1960-61 Minnesota study (Flan- 
ders, 1963), fairly conclusive evidence was 
collected indicating that over 3,000 stu- 
dents in two junior high schools scored 
highest on an attitude inventory assessing 
positive perceptions of their teachers and 
their schoolwork in October, only to have a 
statistically significant decrease in the 
scores of a January readministration of
the same inventory. A follow-up adminis-
tration was about the same as January,
significantly lower than the October scores. 

The 1960 attitude inventory consisted 
of 59 items roughly divided into four sub-
scales on the basis of content: (a) teacher
attractiveness, which included such items 
as, “I would like to have this same teacher 
next year,” and, “This is the best teacher 
I ever had,” (b) fairness of rewards and
punishments, which included such items as,
“This teacher punishes me for things I
didn’t do,” and, “This teacher punishes the
whole class when he (she) can’t find out
who did something,” (c) teacher compe-
tence, which included such items as, “Our
teacher is very good at explaining things
clearly,” and, “It is easy to fool this
teacher,” and (d) interest in schoolwork,
including such items as, “This teacher
makes everything seem interesting and
important,” and, “Most of us get pretty
bored in this class.” The response to each
item was on a 5-point scale from strongly
disagree to strongly agree. All items were
keyed so that a higher score represented
more positive attitudes and perceptions.


* This article is based on research supported by
the United States Office of Education. The first
author was project director; the second author was
in charge of statistical analysis; and B. Leland
Brode was in charge of data collection.

The mean score of the October adminis-
tration of the attitude inventory was 217.
The means of the January and May ad-
ministrations were 204 and 205, respec-
tively, both significantly lower (p < .01)
than the October administration. These
data were collected in so-called academic
classes in Grades 7, 8, and 9, excluding
such subjects as physical education, music,
home economics, and shop.

The results seem quite clear. There is a
significant reduction in the average scores
of positive pupil perceptions between Octo-
ber and January of the school year.


THE PRESENT STUDY


During the 1964-65 school year a Michi-
gan Student Questionnaire (MSQ) was
administered to 101 sixth-grade classes in
15 school districts near Ann Arbor. Thirty
classes were selected for further study from
the October distribution to include the top
10, the bottom 10, and 10 near the average
of the 101 classes. The test was readminis-
tered in January and again in May in
these 30 classes, each administration in-
volving more than 800 pupils, and the
sample can be considered representative of
over 3,000 pupils who were in the larger
population.
The MSQ was essentially the same in- 
ventory used in 1960-61 except that the 
items had been simplified in an effort to 
adjust the vocabulary to the reading skills 
of sixth-grade pupils. A factor analysis of 
the MSQ indicated that the most important 
factor was teacher attractiveness, with ad- 
ditional factors of teacher competence, 
teacher fairness, and lack of pupil anxiety 
forming a combination which was less im- 
portant than the first factor. 

The means for each of the 30 classes are 
shown in Table 1, arranged in terms of 
the top, middle, and bottom 10 classes on 
the October administration. 

The 1964-65 results were nearly identi-
cal to the 1960-61 results. There was a
significant drop in average scores of posi-
tive pupil attitudes during the first 4
months of the school year. The mean score
for the October administration was 178.2
with a standard deviation of 26.52. The
January administration had a mean of
172.2 and a standard deviation of 31.13;
and the May administration had a mean of
170.6 and a standard deviation of 30.60.
The rest of this article will discuss the
various factors that might be related to
the observed change in attitude, one ad-
ministration compared to the next.


Factors Not Associated with Change in
Attitude


Simple regression is not an adequate ex-
planation of these changes. While it is
true that three low classes (Classes 21, 24,
and 27) showed the highest positive
changes, it is also true that three other
low classes (Classes 22, 23, and 28)
showed large decreases. The average loss
(October to May) of the bottom 10
classes was 6.2, compared to 7.6 for the
total group. Furthermore, the top 10
classes do not show uniform loss but in-
stead are symmetrically distributed about
an average loss of 7.5.

The correlations between administra-
tions, based on individual scores, were
positive and fairly high: for October to
January, r = .704; for January to May,
r = .812; and October to May, r = .655.
The correlation between October and May,
based on 30 class averages, is higher:
r = .876. The correlations present a picture
of a fairly stable response pattern both
within and between classes.

There is the possibility that change in
positive pupil attitudes might be associ-
ated with the average class IQ, socio-
economic status, or the percentage of A
and B letter grades assigned to the pupils
by the teacher. Table 2 shows such data
for the nine classes which had high change 
losses and the seven classes with the least 


TABLE 1
Means of 30 Classes on the 1964-65 Michigan Student Questionnaire

          Administration               Administration               Administration
Class   Oct.    Jan.    May    Class   Oct.    Jan.    May    Class   Oct.    Jan.    May
  1     204.9   194.7   194.2   11     180.9   163.7   156.3   21     …       …       …
  2     201.4   200.0   195.4   12     179.2   180.3   176.2   22     …       …       …
  3     200.3   204.9   200.0   13     178.8   173.3   164.5   23     …       …       …
  4     199.6   194.7   192.3   14     176.8   171.4   155.7   24     …       …       …
  5     197.5   195.2   195.2   15     175.2   151.9   155.7   25     …       …       …
  6     197.0   198.5   180.9   16     178.8   178.0   176.1   26     …       …       …
  7     193.6   190.6   187.9   17     177.9   …       …       27     …       …       …
  8     192.0   …       …       18     176.2   …       …       28     …       …       …
  9     191.6   …       …       19     175.4   169.2   167.0   29     156.6   …       …
 10     190.3   178.3   176.3   20     173.8   170.2   166.5   30     149.9   …       …


TABLE 2
Comparisons Between High- and Low-
Attitude Change Classes

        High change                        Low change
Class    IQ      SES    % of A & B   Class    IQ      SES    % of A & B
 11     121.6    81       75           3     118.1    65       91
 23     104.2    67       33          17     114.1    78       55
 14     111.7    71       64          25     104.1    69       48
 15     118.5    67       52           5     113.1    70       66
 28     109.8    74       41          16     120.7    71       81
  8     116.8    78       57          12     100.0    63       50
 22     110.3    67       63          27     116.1    83       62
 13     116.0    83       80
 10     112.7    70       42

Note. SES = socioeconomic status.


amount of change. The mean IQ for the 
high-change group was 113.5, while it was 
112.3 for the low-change group. Here the 
IQ scores used were those based on school 
records and probably involved different 
published tests. The median socioeconomic
rating for the high-change group was 71; 
a median rating of 70 was obtained for 
the low-change group. Here a rating on the 
National Opinion Research Center scale 
(Reiss, 1961) was made of the wage earner’s 
occupation as reported by the teacher. The 
mean percentage of A and B letter grades 
for the high-change group was 56.5 and for 
the low-change group, 64.5. While this last 
difference is consistent with a theory that 
change involving loss of positive attitudes is 
associated with receiving lower grades, a z 
test between independent proportions 
yielded a value of 1.66 which was not high 
enough to reject the null hypothesis at the 
.05 level of significance. All of these data 
suggest that changes in class attitudes are 
not significantly associated with average 
IQ, socioeconomic status, or grades given 
by the teacher. 
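The group summaries quoted above can be checked directly against the class-level entries of Table 2. The following is a minimal sketch, assuming the table values as read from the printed page and assuming simple unweighted statistics over classes; the names `high_change`, `low_change`, and `summarize` are illustrative only.

```python
from statistics import mean, median

# (class, mean IQ, SES rating, % of A and B grades), as read from Table 2.
high_change = [
    (11, 121.6, 81, 75), (23, 104.2, 67, 33), (14, 111.7, 71, 64),
    (15, 118.5, 67, 52), (28, 109.8, 74, 41), (8, 116.8, 78, 57),
    (22, 110.3, 67, 63), (13, 116.0, 83, 80), (10, 112.7, 70, 42),
]
low_change = [
    (3, 118.1, 65, 91), (17, 114.1, 78, 55), (25, 104.1, 69, 48),
    (5, 113.1, 70, 66), (16, 120.7, 71, 81), (12, 100.0, 63, 50),
    (27, 116.1, 83, 62),
]


def summarize(rows):
    """Return (mean IQ rounded to one decimal, median SES rating) for a group of classes."""
    mean_iq = round(mean(row[1] for row in rows), 1)
    median_ses = median(row[2] for row in rows)
    return mean_iq, median_ses


for label, rows in (("high change", high_change), ("low change", low_change)):
    print(label, summarize(rows))
```

The percentages of A and B grades are left out of the summary on purpose: the reported group figures were compared with a z test between proportions, which suggests they were pooled over pupils rather than averaged over classes, and the class sizes needed for that pooling are not given here.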


Two Factors Associated with Change in 
Attitude 


In another study, Morrison (1966) has 
shown that Rotter’s notion of “externality” 
and “internality” (Rotter, Seeman, &
Liverant, 1962) can be assessed among
sixth-grade pupils. By externality is 
meant the tendency of a pupil to believe 
that his successes and failures are caused
by forces beyond his control. By internal- 
ity is meant the tendency to believe that 
successes and failures are self-determined 
and products of one’s own behavior. External
children, according to Morrison's
conception, would be more likely to associate
the good and bad outcomes of classroom
learning activities with the teacher
who is a powerful source of influence. In- 
ternal children, on the other hand, would 
see themselves as more closely associated 
with the good and bad characteristics of 
learning outcomes. 

A test of internality-externality was 
administered to all the pupils in the 30 
classes during the January administra- 
tion of tests. The test consisted of 26 items, 
each containing two statements, and the 
pupils responded by marking the state- 
ment in each item which they believed was 
more often true. Typical items were: (a) 
“If you study you will do well on a test,”
or (b) “People who score the highest on a
test are lucky.”—(a) “Most of the time 
children get the respect they deserve from 
others,” or (b) “Many times a child can 
try hard and no one will pay attention to 
him.”—(a) “Usually other people choose 
me for a friend,” or (b) “Usually I choose 
my own friends.”—(a) “Children get into 
trouble because their parents punish them 
too much,” or (b) “The trouble with most 
children is that their parents are too easy 
with them.” 

Each item was scored 1 if the internal
response was chosen and 2 if the external
response was selected, giving a possible
range from 26 to 52 for the total scores. The
actual scores ranged from 26 to 49. Students
in the lower third of the distribution
(raw scores of 31 and below) were defined
as internals and those in the upper third
(raw scores of 36 and above) were defined
as externals.
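The scoring rule just described can be written out as a short sketch. It assumes each pupil's answers are recorded as a sequence of 26 labels, "I" for the internal statement and "E" for the external one; that encoding and the function names are hypothetical, while the item count and the cutoffs of 31 and 36 are taken from the text.

```python
N_ITEMS = 26          # number of paired-statement items on the test
INTERNAL_MAX = 31     # raw scores of 31 and below: internal
EXTERNAL_MIN = 36     # raw scores of 36 and above: external


def score_test(responses):
    """Score 1 for each internal choice and 2 for each external choice."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected one response per item")
    if any(r not in ("I", "E") for r in responses):
        raise ValueError("responses must be 'I' or 'E'")
    return sum(1 if r == "I" else 2 for r in responses)


def classify(total):
    """Assign a pupil to the internal, external, or unclassified middle group."""
    if total <= INTERNAL_MAX:
        return "internal"
    if total >= EXTERNAL_MIN:
        return "external"
    return "unclassified"


# Example: a pupil who picks the external statement on 8 of the 26 items.
example = ["E"] * 8 + ["I"] * 18
print(score_test(example), classify(score_test(example)))
```

Pupils scoring 32 through 35 fall in neither group under these cutoffs, which is why the later analysis involves fewer pupils than were tested.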

In addition to these tests each of the
30 classrooms was visited by an observer
trained to code verbal communication into
the 10 categories of interaction analysis
developed by Flanders (1965). More than
six visits were made to each class and
more than 7,000 tallies were recorded by
observers. The main results of interaction
analysis will be reported elsewhere; the
interest for the moment is in the incidence
of praise and encouragement expressed by
the teachers during these visits. The occurrence
of this type of teacher statement
varied from a low of .2% to a high of 2.1%
of all tallies recorded by the observer. The
problems of reliability among observers
and the representativeness of the interaction
sampled are too complex to be discussed
here. It can be said, however,
that the relative objectivity of the observation
data, or lack of it, would affect the
data from all classes equally and cannot
account for any of the differences about
to be discussed.

It was hypothesized in this study that: 

1. External children have a greater
negative shift in attitude than do internal 
children. 

2. The classes of low-praise teachers 
have a greater negative shift in attitude 
than do the classes of high-praise teachers. 

3. The attitudes of external children 
are more affected by the praise and en- 
couragement of the teacher than are the 
attitudes of internal children. 

To test these hypotheses a two-way
analysis of covariance in the case of unequal
or disproportionate numbers of observations
in the subclasses was performed
using the third attitude inventory scores
as the dependent variable and adjusting
with the scores from the first administration.
Table 3 includes the October means,
the May means, the May means adjusted
for the initial attitudes, and the
change means. These are arranged in subgroups
of internal and external pupils and
pupils with high-praise and low-praise
teachers.

The slope of the regression line was
.845, and the analysis of covariance yielded
an error mean square of 520.7. The main
effect for internal versus external pupils
produced a mean square of 18,331 and
the resulting F ratio is significant at well
beyond the .01 level (F = 35.20, df =
1/478). The main effect for pupils of high-praise
versus low-praise teachers produced
a mean square of 7,128, also resulting in
an F ratio which is significant at well beyond
the .01 level (F = 13.69, df = 1/478).
TABLE 3
Initial, Final, and Adjusted Means of Attitude Scores from Analysis of
Covariance by Pupil Type and Teacher Style

                             Pupils
Teacher style      Internal   External     All
High-praise
  Initial M          190.6      171.0     183.9
  Final M            187.3      159.2     177.7
  Adjusted M         163.7      152.1     159.7
  Change M            −3.3      −11.8      −6.2
Low-praise
  Initial M          178.7      121.5     143.9
  Final M            173.1      111.3     135.5
  Adjusted M         159.5      146.0     151.3
  Change M            −5.6      −10.2      −8.4
All
  Initial M          185.8      137.8     162.6
  Final M            181.6      127.1     155.2
  Adjusted M         162.0      148.0       -
  Change M            −4.2      −10.7       -


However, the interaction of pupil types 
and teacher styles resulted in a mean 
square of only 90, which is not signifi- 
cant (F = .17). 
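Each F ratio reported for the analysis of covariance is the mean square for the effect divided by the error mean square. A brief sketch of that arithmetic follows, using only the mean squares given in the text; the dictionary keys and the helper name `f_ratio` are illustrative.

```python
ERROR_MEAN_SQUARE = 520.7  # error mean square from the analysis of covariance

# Mean squares for the two main effects and their interaction, as reported.
effect_mean_squares = {
    "pupil type (internal vs. external)": 18331,
    "teacher style (high- vs. low-praise)": 7128,
    "pupil type x teacher style": 90,
}


def f_ratio(effect_ms, error_ms=ERROR_MEAN_SQUARE):
    """F is the ratio of an effect's mean square to the error mean square."""
    return effect_ms / error_ms


for effect, ms in effect_mean_squares.items():
    print(f"{effect}: F = {f_ratio(ms):.2f}")
```

The three quotients round to the reported values of 35.20, 13.69, and .17, so the mean squares and F ratios in this passage are mutually consistent.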

These results indicate that not only did 
external pupils have less positive attitudes 
than did internal pupils early in the school 
year, but when the May scores are adjusted 
by the October scores, it is apparent that 
external pupils experienced significantly 
greater declines in their attitudes than did 
internal pupils. Also, pupils with low- 
praise teachers showed greater losses in 
positive attitudes during the year than did 
pupils with high-praise teachers. How- 
ever, there was no evidence that the at- 
titudes of external children were more af- 
fected by praise or lack of praise on the 
part of the teacher than were the attitudes 
of the internal children. 
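The adjustment of May scores by October scores follows the usual covariance formula: adjusted mean = final mean − b × (initial mean − grand initial mean). The sketch below applies it to the four subgroups, assuming a regression slope b of .845 and a grand October mean of 162.6; both values are readings of the scanned text and Table 3 and should be treated as assumptions, as should the helper name `adjusted_mean`.

```python
SLOPE = 0.845               # assumed within-group regression slope
GRAND_INITIAL_MEAN = 162.6  # assumed October mean for all pupils in the analysis

# (initial M, final M, adjusted M as printed) for each subgroup in Table 3.
subgroups = {
    ("high-praise", "internal"): (190.6, 187.3, 163.7),
    ("high-praise", "external"): (171.0, 159.2, 152.1),
    ("low-praise", "internal"): (178.7, 173.1, 159.5),
    ("low-praise", "external"): (121.5, 111.3, 146.0),
}


def adjusted_mean(initial, final, slope=SLOPE, grand=GRAND_INITIAL_MEAN):
    """Remove the part of the final mean predicted by the initial mean."""
    return final - slope * (initial - grand)


for key, (initial, final, printed) in subgroups.items():
    computed = adjusted_mean(initial, final)
    print(key, round(computed, 1), "printed:", printed)
```

Under these assumptions the computed values agree with the printed adjusted means to within rounding, which is what makes the comparison of internal and external pupils meaningful despite their very different October starting points.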

Discussion and Conclusions


In two separate projects the attitude inventory
scores indicate that positive perceptions
of pupils toward their teacher and
their class activities decrease sometime
during the first 4 months of the school
year. In the second project these changes
were shown to be unrelated to IQ, performance
grades assigned by the teacher,
and the socioeconomic ratings of the
father’s occupation. Two hypotheses about
changes in attitude were supported. First, 
external pupils experience a greater loss of 
positive attitudes toward school than do 
internal pupils. Second, in classrooms of 
teachers who provide less praise and en- 
couragement there is greater loss of posi- 
tive attitudes than in classrooms of 
teachers who provide more praise and en- 
couragement. The third hypothesis was not 
supported since the interaction effects in 
Table 3 are not significant. 

One inference to be drawn is that the 
type of youngster who is more dependent 
on external influences seems to be more 
likely to suffer a loss of positive expec- 
tations than is the one who is more de- 
pendent on internal influences. In addition, 
pupil attitudes toward the teacher and 
the learning activities seem to be related to 
teacher behavior. Whether this difference 
in pupil attitudes is the result of the dif- 
ferent teacher behaviors, or the different 
amounts of teacher praise and encourage- 
ment are the result of the pupils being 
more or less deserving of that praise, is 
not clear from the evidence of the present 
study. The absence of a significant dif- 
ference between high-change and low- 
change classes with respect to the per- 
centage of A and B grades given by the 
teacher (Table 2) would indicate that the 
pupils’ performance was not the deciding 
factor. Also, previous studies (Flanders, 
1963, 1965) have indicated that teacher 
behavior is the more dominant factor and 
that differences in such patterns of teacher 
influence tend to be greater between dif- 
ferent teachers than between different sit- 
uations for the same teacher. 

In this sample, differences among the
pupils had a greater effect than the pres- 
ence or absence of a small amount of 
teacher praise and encouragement. Future 
studies of the erosion of positive pupil at- 
titudes may wish to take into account 
other differences among pupils as well as
differences in teacher behavior. 

This study did not provide direct evidence
concerning two opposite hypotheses
about the erosion of positive pupil at- 
titudes. One theory is that the pupils be- 
come disenchanted with the teacher during 
the first few months of the school year. A
second theory, based on the assumption 
that the October scores are inflated or 
too high, is that as the pupils learn to 
trust their teacher they do not overesti- 
mate their ratings as they felt compelled 
to do with a strange teacher. Without 
going into detail, the authors tend toward 
the first of these two theories, primarily 
because the teacher’s behavior is the predominant
influence in the typical classroom;
but much more evidence will be
required before any conclusions can be 
reached. 

Meantime, lack of loss of positive at- 
titudes may be the mark of a good match 
between teaching behavior and particular 
attributes among pupils. Apparently, in 
most classrooms such a match does not 
exist. 
DIRECT VERSUS VICARIOUS PAIRED-ASSOCIATE LEARNING
IN RETARDATES AND NORMALS


In 4 experiments retardate and normal Ss either listened to a peer
learning an initial list of PAs or learned the same list directly prior
to receiving a PA test list. The major variables studied were transfer
paradigm (A-B, A-B; A-B, A-C; A-B, A-B’; A-B, C-D), number of
1st-list trials (5, 9, 15), and instructions to observers (“listen” or
“learn”). Results indicated that for both populations: (a) vicarious
List-1 learning facilitated transfer in the A-B, A-B paradigm, the extent
of this transfer depending upon instructional conditions for normals
but not for the mentally retarded, and (b) there was a consistent
trend for performance to be better when List 1 was learned
directly rather than vicariously. The results were taken as support
for the notion that retardates are more outer-directed than normals
and the failures to obtain significant transfer following vicarious
learning in the A-C and A-B’ paradigms were attributed to low levels
of initial learning. It was suggested that vicarious verbal learning
is best conceptualized within the framework of traditional PA
transfer theory.


It is now well established that a wide
range of behaviors can be learned in human
adults and children by vicarious
means. These include galvanic-skin-response
aversive conditioning (Berger,
1962), verbal response classes (Ditrichs,
Simon, & Greene, 1967; Kanfer & Marston,
1963; Marston, 1964, 1966; Simon,
Ditrichs, & Jamison, 1965), nonverbal aggression
(Bandura & Walters, 1963), and
discriminative responding (McDavid, 1959;
Paschke, Simon, & Bell, 1967).

A common assumption in such research
is that the same fundamental learning
parameters are operative in both direct
and vicarious acquisition, for example,
frequency, contiguity, etc. However, the
extent to which the social learning situation
introduces additional variables which
may facilitate or hinder performance relative
to direct procedures remains a continuing
theoretical and empirical question.
That the former can occur, for example, 
has been demonstrated by Bandura and 
McDonald (1963) who found that chil- 
dren observing adult models expressing 
moral judgments counter to the group’s 
orientation showed significant changes in 
their judgments relative to Ss who were 
directly reinforced for the same deviant 
behavior. Similarly, Elkonin (1957) re- 
ports the results of a study by Pen who 
investigated the vicarious acquisition of 
positive and inhibitory conditioned re- 
flexes in children. Quoting Pen, it is noted 
that “in many cases where a conditioned 
connection could not be produced in the 
child in the usual way (by timing a signal 
just before, or at the same time as, rein- 
forcement) it sprang up easily by imita- 
tion [p. 62].” By contrast, other research 
has demonstrated slower acquisition under 
observational than under direct conditions 
(e.g., Van Wagenen & Travers, 1963).

An analysis of the factors responsible
for either the facilitation or inhibition of 
learning under observational conditions 
may be of some importance in suggesting 
possible reasons for observed performance 
differences between populations of “slow 
learners,” for example, retardates, and so-called
“normal” populations. For example,
it has been hypothesized that attentional 
deficits are primarily responsible for the 
inferior performance of the mentally re- 
tarded on selected tasks (Zeaman & 
House, 1963). Since it is explicitly as- 
sumed that attentiveness to the model’s 
behavior is a necessary component of ob- 
servational learning, and since retardates 
have been characterized as being more 
outer-directed than normals (Zigler, 1966), 
one way in which the observational situa- 
tion may enhance learning is by facili- 
tating attentional responses to the task- 
relevant cues. That essentially the same 
interpretation is given in educational re- 
search to account for the learning su- 
periority of overt as against covert re- 
hearsal of materials (Anderson, 1967), 
however, suggests that both direct and 
vicarious procedures may facilitate atten- 
tional responding. More important, these 
conflicting interpretations indicate that the 
conditions under which one is a more ef- 
fective learning technique than the other 
have not yet been delineated. 

Among other deficits, an impoverished 
verbal repertoire has frequently been said 
to characterize the retardate individual. 
This is implied in the difficulty retardates 
have in coordinating motor and verbal be- 
havior, semantic generalization anomalies, 
and lowered verbal learning acquisition 
rates (Denny, 1964). In line with previous 
discussion, then, a potentially fruitful 
area of research would be the systematic 
study of direct versus vicarious verbal 
acquisition in retardate populations. The 
present experiments were specifically concerned
with this problem and were designed
to compare the direct with the vicarious
learning of paired-associate (PA) verbal
materials. Both normals and retardates
were employed as Ss, and treatment groups 
were constituted to examine direct trans- 
fer (stimuli and responses identical in 
training and test lists), associative facili- 
tation and interference, and learning-to- 
learn effects, under direct and vicarious 
conditions of first-list learning. 


Experiments I and II


Method 


Subjects. In Experiment I, 75 normal children 
were selected from three sixth-grade classes of a 
local junior high school. The three classes repre- 
sented low, average, and high reading ability 
children, respectively, as determined by reading 
comprehension scores on the Iowa Test of Basic 
Skills. In Experiment II, 30 institutionalized male 
familial retardates served as Ss. In both Experi- 
ments II and IV, Ss with known organic brain 
damage and histories of epilepsy were excluded 
from the sample. Information concerning race and 
socioeconomic status was not obtained. Descriptive 
data for all Ss are presented in Table 1. 

Procedure. Several weeks prior to the treatments 
of Experiment I, children in each of the three 
classes were asked to vote for a member of their 
class to whom they would listen most if he or she 
were allowed to be the teacher for a day. The 
child receiving the greatest number of votes in
each of the classes, two girls and a boy, then made
four tape recordings with E, a male graduate student.
In each of these recordings the selected child
acted as an S in a simulated PA learning task. The
materials presented to the taped S were six bigram-word
PAs. Stimulus bigrams were of low intralist
similarity and were selected from norms provided
by Underwood and Schulz (1960). Response words


TABLE 1 
Means, Standard Deviations, and Ranges of Ages, Intelligence, Institutionalization, and
Reading Comprehension Scores


‘ Reading > Institutionalization 
Emer! sonjets [ow | Tntaligence® | compreenson® om) 
M | SD| Range | uw | sD | Range | | SD | Range | M | SD | Rare’ 
POURS Aus ae ees eR | 2 TY Peze | 
I | Normals by 
gets level apilineta 
Ww -6 | 11.0-13.5 | 98.3 | 13.8] 77-128 | 4.8 | 0.9 | 3.0-6.7 
Average 25 ae ns eres 106.4 10.6 | 87-126 | 6.8 | 1.0 | 5.2-8.6 
iL Boaeiies 80 | isa | 126 | weeage | od | 84] 108g | 87 | 8 | ort0.7) | oy | gaat 
TI | Normals 144 | 10.2] 1.1 | 9.1-11.3 | 112:2 | 10:1 | 101-120 o-142 
IV_ | Retardat 96 | 15.0} 0.9 | 11.2-18.1 | 52.2 | 974 7 5.8 | a1 | 2 
® Based on Otis, WISC, or Kuhlmann-Anderson tests for normals tanford-Bi _, WISC, WAIS, and Merl 
poe tests ae retardates Since these scores were taken from s school or Sootitutionel t mate and woe bieed upon numerous 
men are not re 


> Grade-equivalent scores. 
were culled from children’s word association norms
(Palermo & Jenkins, 1964) and were all AA words 
equated for Thorndike-Lorge frequency of oc- 
currence on the G count. Following a single 
presentation of the six PAs, the taped S responded 
with the correct associate following each of the 
six bigrams for nine successive errorless trials. Five 
random orders of the PAs, replicated twice, were 
presented to the taped S. The taped-presentation 
sequence was as follows: E spelled the bigrams, S 
responded with the appropriate word within a 3-second
anticipation interval, E repeated the bigram
and pronounced the correct response word. The 
interitem interval was 3 seconds. A 6-second inter- 
trial interval was employed with a bell being 
sounded midway through this interval to signal 
the start of a new trial. 

All Ss were run individually in a room provided 
by the school. Upon entering the room, Ss were 
first required to learn a three-item PA list con- 
sisting of single number-single letter pairs to a 
criterion of 2 successive errorless trials or for a 
maximum of 25 trials. All Ss tested were able to 
learn these PAs within the 25-trial limit. Instruc- 
tions as well as stimulus and response items were 
presented through headphones via tape. Following 
this warm-up task, each S was told that he was 
going to hear a tape recording of one of his classmates
learning some letters and words. Instructions
to S were simply to listen to the taped learning
session. The S then heard his classmate receive
PA instructions followed by one of the four prerecorded
tasks. Immediately following this listening
session, S was instructed to learn some material
himself in the same way that he had just heard
his classmate learn. A PA transfer task followed
these instructions and S was required to learn this
list to a criterion of 3 successive errorless trials or
for a maximum of 40 trials. The transfer task was
administered via tape at presentation rates identical
to those employed in the vicarious session.

The experimental design called for the use of
five treatment groups. For Group E-1, a direct
transfer group, the stimulus and response members
of the training tape were identical to those employed
on the transfer tape. For Group E-2, a
mediated facilitation group, the stimulus members
in the training and transfer tapes were identical,
but the response members of the transfer tapes
were dominant word associates of the training
tape responses as determined from the children’s
word association norms employed. For Group E-3,
an associative interference group, the stimulus
members of the two tapes were the same but the
responses were words having no associative relation
to one another. For Group C-1, the
stimulus and response members of the training
and transfer tapes differed from each other. This
group provided a reference level with respect to
which the differential transfer effects of the experimental
groups could be evaluated. A second control
group, C-2, received a digit cancellation task in
place of the vicarious PA experience for the duration of
the training phases of the other groups.

The PA materials to which Group E-1 listened
during the vicarious phase, as well as those
presented to all groups in the transfer list were:
XP-low, MB-chair, UA-fast, KI-sleep, FI-white, and
SC-flower. These same bigrams were paired with
the following response words during the vicarious
phase for Groups E-2 and E-3, respectively: high,
table, slow, bed, black, blossom; music, cheese,
king, long, girl, and memory. Group C-1 was exposed
to the following PAs during the vicarious
phase: AF-music, GL-cheese, LU-king, PH-long, RV-girl,
and WN-memory.
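The relation between each group's vicarious list and the common transfer list can be made explicit with a small data structure. This is a sketch only: the bigram spellings follow the scanned text and individual letters may be misread, and the names `TRANSFER`, `LIST1`, and `overlap` are illustrative rather than part of the original procedure.

```python
# Transfer (test) list learned by every group: stimulus bigram -> response word.
TRANSFER = {"XP": "low", "MB": "chair", "UA": "fast",
            "KI": "sleep", "FI": "white", "SC": "flower"}


def with_responses(words):
    """Pair the transfer-list bigrams, in order, with a different set of responses."""
    return dict(zip(TRANSFER, words))


# List heard during the vicarious phase, by group (C-2 heard no list).
LIST1 = {
    "E-1": dict(TRANSFER),
    "E-2": with_responses(["high", "table", "slow", "bed", "black", "blossom"]),
    "E-3": with_responses(["music", "cheese", "king", "long", "girl", "memory"]),
    "C-1": {"AF": "music", "GL": "cheese", "LU": "king",
            "PH": "long", "RV": "girl", "WN": "memory"},
}


def overlap(list1, transfer=TRANSFER):
    """Return (number of shared stimuli, number of identical stimulus-response pairs)."""
    shared_stimuli = set(list1) & set(transfer)
    identical_pairs = sum(1 for s in shared_stimuli if list1[s] == transfer[s])
    return len(shared_stimuli), identical_pairs


for group, pairs in LIST1.items():
    print(group, overlap(pairs))
```

Groups E-2 and E-3 look identical on this count (same stimuli, no shared responses); what separates them is whether the first-list responses are word associates of the transfer responses, which depends on the association norms and is not encoded here.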

In Experiment II, institutionalized male re-
tardates were treated in essentially the same 
manner as were the normal Ss with the following 
exceptions: (a) The Ss were asked to vote for a 
member of their cottage to whom they would 
listen most if he were allowed to be a teacher 
for a day. The male retardate selected then 
served as S in all tape recordings of the major 
experimental conditions; (b) the experiment was 
conducted in a mobile trailer situated on the 
institution grounds; (c) a female E was used;
(d) a 2,000-cps tone was sounded 1 second prior
to the onset of each stimulus bigram throughout 
the training and transfer tasks. A pilot study had 
revealed that such a procedure was necessary to 
hold the attention of the retardates during the 
aurally presented PAs. Other aspects of the pro- 
cedure, including the stimulus materials employed, 
rate of presentation, etc., were identical to those 
employed with normal Ss. 


Results 


The major dependent variables were 
mean correct responses and trials to cri- 
terion. Since essentially comparable find- 
ings were obtained with the two measures 
(for normals, r = —.85; for retardates, 
r = —.90), only results employing the 
former measure will be presented. 

Experiment I. Figure 1 shows the mean
number of correct anticipations on the criterion
list by blocks of five trials for all
groups. A repeated measurements analysis
of variance on Blocks 1 and 2 was first
conducted to obtain a measure of group
differences unconfounded by possible ceiling
effects, that is, Ss in all groups reaching
criterion. The results of this analysis
indicated that Reading Levels (F = 6.18,
df = 2/60, p < .01) and Blocks (F =
151.59, df = 1/60, p < .001) were significant
sources of variance. The Groups effect
fell short of statistical significance (F =
2.04, df = 4/60, p < .10). None of the
interactions was statistically reliable. As
can be seen in Figure 1, there is a tendency
for Ss in Group E-1 to give more correct
responses than Ss in Group C-2, with all
other treatment group comparisons yielding
correspondingly higher p values.
Tukey’s test of multiple comparisons indicated
that Ss in the high reading level
groups (M = 3.90) gave more correct responses
than Ss in the average (M =
2.97) and low (M = 2.79) reading level
groups (p < .05 and p < .01, respectively);
the latter two groups did not differ
significantly from each other. An
analysis over Blocks 1-4 yielded results
essentially the same as those obtained over
Blocks 1 and 2.

Fig. 1. Mean number of correct anticipations
per block of five trials on the paired-associate test
list for normal Ss (Experiment I).

Error analyses over Blocks 1 and 2 were 
performed separately for omissions and 
intralist intrusions. For the omissions 
analysis, Reading Levels and Blocks were 
the only significant sources of variance 
(F = 3.18, df = 2/60, p < .05; F =
180.55, df = 1/60, p < .001, respectively).
Tukey tests indicated that Ss in the high
reading level group (M = 3.72) made
significantly fewer omissions than Ss in
the middle (M = 6.28) and low (M =
6.54) reading level groups (p < .05), with
other differences failing to reach statistical
significance. There were reliably fewer 
omissions made on Block 2 (M = 3.20) 
than on Block 1 (M = 7.90) of the test 
list. A similar analysis on intralist errors 
indicated that Blocks was the only signifi- 
cant source of variance (F = 24.05, df = 
1/60, p < .001). The means for Blocks 1 
and 2 were 8.70 and 6.59, respectively. 

To ascertain the relative transfer ef- 
fects of vicarious versus direct PA ex- 
perience, the performance of Group E-1 
on Trials 1-5 of the transfer list was com- 
pared with the performance of Group C-2 
on Trials 10-14. Since Group C-2 received 
direct learning experience on the criterion 
list for Trials 1-9 and Group E-1 re- 
ceived vicarious experience for nine trials 
prior to the transfer phase, this compari- 
son appropriately tests for the relative 
transfer effects of vicarious versus direct 
experience early in criterion list learning.
A factorial analysis of variance (Groups ×
Reading Levels) on mean correct anticipations
on Trials 1-5 for Group E-1 and
on Trials 10-14 for Group C-2 revealed
that reading levels was the only significant
source of variance. While the group
means indicated a tendency for the C-2
group (M = 19.17) to perform better than
the E-1 group (M = 15.89), the F ratios
for Groups (F = 2.30, df = 1/24, p > .05) 
and for the Groups X Reading Levels in- 
teraction (F = .21, df = 2/24) were not 
significant. 

Experiment II. Figure 2 presents the
mean number of correct anticipations on
the criterion list by blocks of five trials.
A repeated measurements analysis of
variance on Blocks 1 and 2 indicated that
only the main effect for Blocks was significant
(F = 43.14, df = 1/25, p < .001).
As with the results for normal Ss, the
Groups effect fell just short of statistical
significance (F = 2.52, df = 4/25, p <
.10), Group E-1 tending to give more
correct responses than Ss in the other
four treatment groups.

Since relatively few retardate Ss
reached criterion within the 40-trial
limit, a Groups × Blocks analysis of
variance was also conducted on the number
of correct responses over Blocks 1-8.
The results of this analysis indicated that
Blocks was the only significant source of
variance (F = 75.55, df = 7/175, p <
.001).

An analysis of variance on number of 
omissions made over Blocks 1 and 2 indi- 
cated that only Blocks was significant 
(F = 31.80, df = 1/25, p < .001). Re- 
spectively, the means of Blocks 1 and 2 
were 19.63 and 12.50. A similar analysis 
on intralist intrusions showed that Blocks 
was again the only significant ratio (F =
5.85, df = 1/25, p < .05). For retardate
Ss, there was a significant increase in
intralist errors from Block 1 (M = 5.18) 
to Block 2 (M = 6.97) of the test list. 

A t test was performed between the mean
correct anticipations made on Trials 1-5
of the transfer list for Group E-1 and the
corresponding performance of Group C-2
on Trials 10-14. Consistent with the results
obtained for normal Ss, there was a
tendency for the C-2 group (M = 9.17) to
make more correct responses than the E-1
group (M = 6.67), but these means did
not differ significantly from one another
(t = 1.32, df = 10, p > .05).

Fig. 2. Mean number of correct anticipations
per block of five trials on the paired-associate test
list for retarded Ss (Experiment II).
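The degrees of freedom reported in Experiments I and II follow from the sample sizes and the number of design cells. The sketch below reproduces them under the assumption, not stated outright in the text, that Ss were divided equally among cells (5 per Group × Reading Level cell in Experiment I, 6 per group in Experiment II); the function names are illustrative.

```python
def between_error_df(n_subjects, n_cells):
    """Error df for between-subject effects: subjects minus design cells."""
    return n_subjects - n_cells


def within_effect_df(n_levels):
    """df for a repeated factor such as Blocks."""
    return n_levels - 1


def within_error_df(n_levels, error_df):
    """Error df for a repeated factor: (levels - 1) times the between-subject error df."""
    return (n_levels - 1) * error_df


def t_test_df(n_first, n_second):
    """df for a t test between two independent groups."""
    return n_first + n_second - 2


# Experiment I: 75 normal Ss in 5 groups x 3 reading levels.
exp1_error = between_error_df(75, 5 * 3)
# Experiment II: 30 retardate Ss in 5 groups.
exp2_error = between_error_df(30, 5)

print("Experiment I error df:", exp1_error)
print("Experiment II error df:", exp2_error)
print("Blocks 1 and 2:", within_effect_df(2), within_error_df(2, exp2_error))
print("Eight blocks:", within_effect_df(8), within_error_df(8, exp2_error))
print("E-1 vs. C-2 t test:", t_test_df(6, 6))
```

On these assumptions the values match the reported df of 1/60 and 4/60 for Experiment I and 1/25, 4/25, 7/175, and 10 for Experiment II; a Reading Levels effect with three levels would carry 2 df.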


Experiments III and IV


The results of Experiments I and II are 
equivocal in demonstrating the relative 
superiority of either direct or vicarious 
acquisition as well as in demonstrating 
whether vicarious PA learning occurs at all. 
At best, it could be said that, for both 
populations, direct learning produces 
somewhat better transfer performance than 
vicarious learning and that vicarious ex- 
perience tends to improve performance on 
transfer pairs comprising the same stimuli 
and responses as in first-list “learning.” 
However, both of these statements lack 
statistical confirmation. Consequently, Ex- 
periments III and IV were conducted to 
examine additional variables which may 
provide more definitive answers to these 
questions. 

As stated earlier, a fundamental premise 
of the present research has been that, 
given the same task, those variables found 
to be significant in direct learning should 
also be operative under vicarious condi- 
tions. Thus, if it is assumed that the failure 
to find significant transfer in Experiments 
I and II under vicarious conditions is at- 
tributable to insufficient amounts of List-1 
learning, then two variables which are 
likely to facilitate such learning are an 
increased number of observational trials 
and directed instructions to Ss. That the 
former is a significant variable under ob- 
servational conditions has been demon- 
strated in at least two discrimination learn- 
ing studies employing retardates as Ss 
(Fletcher, 1966; Paschke, Simon, & Bell, 
1967). Likewise, instructions have consist- 
ently been found to be a potent learning 
variable in both direct (e.g., McLaughlin,
1965) and vicarious (e.g., Marston, 1966)
learning situations.

In Experiments III and IV, Ss received
either 5 or 15 observational trials, half
under instructions to listen to the model
and half under instructions to learn the
materials the model was learning. The
scholastic competence of the model was
also manipulated as a third independent
variable for normal Ss. It was expected
that 15 trials and instructions to learn
would, either as main effects or interact- 
ing variables, result in significant transfer 
under conditions of vicarious List-1 ac- 
quisition by increasing the degree of learn- 
ing relative to Experiments I and II. To 
permit more precise comparisons with 
performance under nonobservational con- 
ditions, direct learning groups, as well as 
controls, were included in the design. 


Method 


Subjects. 154 normal children drawn from the 
fourth, fifth, and sixth grades of local schools and 
114 institutionalized male and female familial re- 
tardates initially served as Ss. Descriptive data 
for both populations is presented in Table 1. 

Procedure. The general procedures in Experi- 
ments III and IV were identical and were the 
same as those employed in Experiments I and II 
with the following exceptions: (a) The same adult 
male and sixth-grade female child served as E 
and model, respectively, for all vicarious learning 
(VL) Ss; this E also prerecorded List-1 materials
for direct learning (DL) Ss, as well as List-2
materials for all Ss. (b) A mixed transfer design
was employed. Stimuli for the two lists of PAs
were two-digit even numbers ranging from 10 to
20. List-1 responses were fast, sleep, soldier, eagle,
king, and music. Test-list interference words were
white and girl; control words were high and table.
(c) A 2,000-cps tone preceded each of the PA
stimuli for all Ss. (d) An 8-second intertrial
interval was utilized. (e) The test list was learned 
to a criterion of 2 consecutive errorless trials or 
for a maximum of 35 trials. Further, to avoid 
score estimations for fast learners, all Ss were run 
for a minimum of 10 trials on the test list. 

For various reasons, for example, failure to 
learn the practice list, failure to cooperate at some 
point in the experiment, E mistake, 10 normal Ss
and 18 retardates were eliminated from the experi-
ment, leaving total populations of 144 normals
and 96 retardates. Those eliminated were approxi-
mately equally distributed among the various
treatment conditions.

Three major independent groups were employed
in each of the two experiments: VL Ss, DL Ss,
and controls. For VL Ss in Experiment III, the
design was a 2 (instructions—“listen” versus 
“learn”) X 2 (number of vicarious trials—5 versus 
15) X 2 (model's scholastic competence—high 
versus low) X 3 (conditions of transfer—direct 
transfer, interference, learning-to-learn) mixed fac- 
torial with the first three of these variables con- 
stituting between-Ss effects and the latter being a 
within-Ss variable. The scholastic competence of 
the model was varied by instructing half of the 
Ss that the girl learning on tape was “a very 
smart student and always gets very high grades.”
The other half of the Ss were told that the girl
was “not a very smart student and always gets
very low grades.” The Ss in the “intentional”
group were instructed to try to learn the words
along with the taped S because afterwards they
would be asked to do the same thing. “Incidental” 
Ss were instructed simply to listen to how the girl 
on tape did. Twelve Ss were randomly assigned to 
each of the 8 cells with the restriction that ap- 
proximately equal numbers of males and females 
appear in each cell. 

DL Ss were required to learn List 1 directly in 
lieu of listening to the model’s performance. To 
permit comparisons with VL Ss, half of these Ss 
received 5 List-1 anticipation trials while the 
other half of the Ss received 15 List-1 trials prior 
to the test list. Finally, two groups of control Ss
played an “etch-a-sketch” game for lengths of
time corresponding to List-1 learning times for
DL-5 and DL-15 Ss, respectively. There were 12 Ss
in each of the two DL and two control groups.

Experiment IV was identical in design to Ex- 
periment III with the exception that retardates 
were employed as Ss and the scholastic competence 
of the model was not manipulated as an independ- 
ent variable for VL Ss. Instructions to these Ss 
simply made reference to “a girl learning some 
numbers and words.” 
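The group sizes implied by the two designs can be tallied with a short sketch. It assumes 12 Ss per cell throughout; the text states this figure for Experiment III only, so its use for Experiment IV is an assumption, chosen because it reproduces the reported total of 96 retardates. The function and variable names are illustrative only.

```python
from math import prod

SS_PER_CELL = 12  # stated for Experiment III; assumed here for Experiment IV


def group_sizes(vl_between_factors):
    """Return (VL, DL, control) subject counts for one experiment.

    vl_between_factors maps each between-Ss factor of the vicarious
    learning (VL) design to its number of levels.
    """
    vl_cells = prod(vl_between_factors.values())
    vl = vl_cells * SS_PER_CELL
    dl = 2 * SS_PER_CELL       # DL-5 and DL-15 groups
    control = 2 * SS_PER_CELL  # two time-matched control groups
    return vl, dl, control


exp3 = group_sizes({"instructions": 2, "vicarious_trials": 2, "model_competence": 2})
exp4 = group_sizes({"instructions": 2, "vicarious_trials": 2})

print(exp3, sum(exp3))  # (96, 24, 24) 144
print(exp4, sum(exp4))  # (48, 24, 24) 96
```

Under these assumptions the totals agree with the 144 normals and 96 retardates retained after eliminations.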


Results 


As in Experiments I and II, the major 
dependent variables were mean correct 
anticipations and trials to criterion. Results 
employing trials to criterion will be pre- 
sented only when discrepancies exist with 
the former measure. 

Experiment III. To insure the com- 
parability of the criterion items, as well 
as the existence of reliable transfer effects 
under direct learning conditions, a 2 
(groups—direct learning versus controls) 
X 2 (5 versus 15) x 3 (transfer condi- 
tions) X 5 (trials) repeated measurements 
analysis of variance was conducted. Since 
normal Ss approximated asymptotic per-
formance by the fifth trial, only Trials
1-5 were examined. This analysis revealed
significant effects for Transfer Conditions
(F = 4.06, df = 2/88, p < .05), Groups x
Transfer Conditions (F = 4.78, df = 2/88,
p < .05), and Trials (F = 38.26, df =
4/176, p < .001). To evaluate the nature of
the interaction, separate 2 x 3 x 5 analy-
ses of variance were conducted for DL and
control groups. With the exception of the
trials effect, the latter analysis yielded no
significant results. In addition to trials,
the main effect for Transfer Conditions was
significant for DL Ss (p < .01), the means
being 1.70, 1.48, and 1.42 for direct trans-
fer (DT), interference (INT), and new
(LTL) items, respectively. Results are 
presented in the upper panels of Figure 3. 
In the foregoing analysis, the Groups x 
Transfer Conditions x Number of Vicari- 
ous Trials interaction was marginally sig- 
nificant (p < .10), there being a tendency 
for greater DT and more INT to occur with 
15 List-1 trials than with 5 trials. This 
trend was found to be significant in the 
trials to criterion analysis, the Groups x 
Number of Vicarious Trials x Transfer 
Conditions interaction being statistically 
significant (F = 3.29, df = 2/88, p <
.05). As in the correct anticipations analy-
sis, control group analysis revealed no sig-
nificant ratios. A similar analysis on the
DL groups, however, indicated that Trans-
fer Conditions (p < .05) and the Transfer
Conditions x Number of Vicarious Trials
interaction (F = 3.65, df = 2/22, p <
.05) were significant sources of variance.
This interaction, along with the combined
performance of control Ss, is presented in
Figure 4. As can be seen, the interaction
reflects the fact that significant INT
effects were obtained with 15 trials on
List 1 but not with 5 List-1 trials.

Fig. 3. Mean number of correct anticipations
per transfer pair on the first five trials of the
paired-associate test list for normal direct learn-
ing, vicarious learning, and control Ss (Experiment
III). (Upper panels: direct learning, N = 24, and
controls; lower panels: vicarious groups, N = 48,
under instructions to listen and to learn. Curves:
direct transfer, interference, learning-to-learn.)

Fig. 4. Mean trials to criterion on the paired-
associate test list for normal Ss as a function of
transfer conditions and number of direct List-1
learning trials (Experiment III). (Curves: control,
5 List-1 trials, 15 List-1 trials.)

Error analyses on total omissions and 
total intralist intrusions were essentially 
consonant with the correct anticipations 
data. For omissions, the Groups x Trans- 
fer Conditions interaction was the only 
significant ratio (F = 9.23, df = 2/88, 
p < .001). For DL Ss, the means for
DT, INT, and LTL pairs were 1.18, 1.96,
and 1.96, respectively; for control Ss, the
means were similar and averaged 1.86.
DL Ss (M = .60) also made fewer in-
trusions than did controls (M = 1.17)
(F = 4.88, df = 1/44, p < .05).

The performances of VL Ss were next
examined by conducting a 2 (listen versus
learn) X 2 (number of vicarious trials) x
2 (peer status) X 3 (transfer condi-
tions) x 5 (trials) mixed analysis of
variance on test list anticipation scores.
Results indicated significant effects for
Number of Vicarious Trials (F = 7.43, df =
1/88, p < .01), Transfer Conditions (F =
19.51, df = 2/176, p < .001), Trials (F =
113.75, df = 4/352, p < .001), Transfer
Conditions x Trials (F = 4.94, df =
8/704, p < .001), and Instructions x
Transfer Conditions x Trials (F = 2.71,
df = 8/704, p < .01). Reliably more cor-
rect responses were made with 15 vi-
carious trials (M = 1.52) than with 5
trials (M = 1.31). The triple interaction
is presented in the lower panels of Figure
3. As is evident, instructions to learn sig-
nificantly facilitated performance on DT
items and resulted in a greater (although
nonsignificant) tendency for INT to occur
on Trial 1 as compared with instructions
to listen. In agreement with the trends
noted in Experiment I, overall perform-
ance under both instructional conditions is
somewhat below the performance of DL
Ss, the combined means for DT, INT, and
LTL pairs being 1.59, 1.30, and 1.36, re-
spectively.

Fig. 5. Mean number of correct anticipations
per transfer pair on Trials 1-10 of the paired-asso-
ciate test list for retardate direct learning, vicari-
ous learning, and control Ss (Experiment IV).

A similar analysis on total omissions 
revealed that Transfer Conditions was the 
only significant source of variance (F = 
13.29, df = 2/176, p < .001). The means 
for DT, INT, and LTL pairs were 1.20, 
2.08, and 1.88, respectively, the latter two 
not differing significantly from each other. 
For total intralist intrusions significant 
effects were obtained for Number of Vi- 
carious Trials (F = 5.96, df = 1/88, p< 
.05) and Transfer Conditions (F = 5.12, 
df = 2/176, p < .01). The means for 5 
and 15 vicarious trials were 1.49 and .79, 
respectively. For DT, INT, and LTL pairs, 
the means were .83, 1.27, and 1.31. 
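The degrees of freedom reported for the vicarious learning analyses can be cross-checked against the design. The sketch below assumes the conventional partition for a mixed (split-plot) analysis of variance, in which each within-Ss effect is tested against its own interaction with Ss within groups; the text does not state which error terms were used, and the helper names are hypothetical.

```python
from math import prod


def between_error_df(n_subjects, n_cells):
    """Error df for between-Ss effects: Ss within cells."""
    return n_subjects - n_cells


def within_effect_df(n_subjects, n_cells, within_levels):
    """Return (effect df, error df) for a within-Ss effect.

    within_levels lists the number of levels of each within-Ss factor
    entering the effect, e.g. [3] for Transfer Conditions or [3, 5]
    for Transfer Conditions x Trials.
    """
    effect = prod(k - 1 for k in within_levels)
    return effect, effect * between_error_df(n_subjects, n_cells)


# Experiment III, VL Ss: 96 Ss in 8 between-Ss cells.
print(between_error_df(96, 8))          # 88
print(within_effect_df(96, 8, [3]))     # (2, 176)
print(within_effect_df(96, 8, [5]))     # (4, 352)
print(within_effect_df(96, 8, [3, 5]))  # (8, 704)
```

Under these assumptions the computed values match the reported 1/88, 2/176, 4/352, and 8/704.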

Experiment IV. A 2 (direct learning 
versus controls) x 2 (number of List-1 
trials) x 3 (transfer) x 10 (trials) mixed 
analysis of variance on performance over 
Trials 1-10 revealed that Trials (F = 
262.75, df = 9/396, p < .001) was the 
only significant source of variance. Thus, 
contrary to results obtained with normal 
Ss, retardates failed to show significant 
test list transfer as a result of directly 
learning an initially related list of PAs 
for either 5 or 15 trials. For DL Ss, the 
overall means for DT, INT, and LTL 
pairs were 1.12, 1.00, and 1.01, respec- 
tively; the corresponding means for con- 
trols were .95, 1.00, and .85. 

A 2 (listen versus learn) x 2 (number
of vicarious trials) x 3 (transfer) x
10 (trials) mixed analysis of variance for
VL groups indicated significant effects for
Transfer Conditions (F = 15.94, df =
2/88, p < .001), Trials (F = 29.80, df =
9/396, p < .001), and Transfer Condi-
tions X Trials (F = 1.64, df = 18/792,
p < .05). As can be seen in Figure 5,
more correct responses were made to DT
pairs (M = 1.23) than to INT (M = .88)
and LTL (M = .88) pairs. For comparison
purposes, the performances of DL and
control Ss are also presented in this figure.

As in Experiment III, error analyses were
consonant with correct response data.
Comparing DL and control Ss, analyses
of variance on the first two blocks of five
trials revealed that Blocks was the only
significant source of variance for both
omissions ([...]) and intralist intrusions (F = 10.81,
df = 1/44, p < .001). In the omissions
analysis for VL Ss, Transfer Conditions
(F = 3.97, df = 2/88, p < .05) and Blocks
[...] means for DT, INT, and LTL items were
[...], 2.57, and 2.53, respectively. For in-
trusions, significant ratios were obtained
for Transfer Conditions (F = 8.30, df =
2/88, p < .001) and Transfer Conditions x
Blocks (F = 4.55, df = 2/88, p < .05).
The latter reflects decreasing differences
between DT, INT, and LTL items from
Block 1 to Block 2. The overall means
were 1.65, 2.30, and 2.57, respectively.


Discussion


The major goals of this research were
(a) to demonstrate the existence of vi-
carious PA learning in normals and re-
tardates, (b) to examine the relative ef-
ficacy of direct versus vicarious procedures,
and (c) to study some of the variables
which may be of common importance in
both these conditions of acquisition.

With reference to the first aim, the
present results indicate that, following
exposure to the errorless performance of a
model, both normal children and retardates
will learn the same PAs faster in transfer
than will Ss having no prior PA experience.
The marginally significant trends for direct
transfer pairs in Experiments I and II,
as well as the reliable findings in Experi-
ments III and IV, jointly support this
conclusion. It is also clear, however, that
instructional conditions during the obser-
vational trials are of primary importance
in producing this effect for normals but
not for the mentally retarded. Thus, for
normal Ss, instructions to learn the ma-
terials the model was learning produced
reliable transfer in Experiment III, whereas
listen instructions yielded nonsignificant
and consistently lower levels of perform-
ance in Experiments I and III. Independent
of how this instructional effect is con-
ceptualized, it is important to note that
these findings are consonant with results
reported in the intentional-incidental learn-
ing literature and suggest some measure
of correspondence between the two areas 
of investigation. That facilitation was ob- 
tained in Experiment IV independent of 
instructional conditions supports the notion 
that retardates are more outer-directed 
than normals (Zigler, 1966) and indicates 
that under conditions designed to facilitate 
orienting responses to the task-relevant 
cues, for example, exposure to the selective 
behavior of a social model, incidental-in- 
tentional learning differences fail to ap- 
pear. 

Both associative facilitation and inter- 
ference effects have been reported for nor-
mal children (e.g., Norcross & Spiker,
1958) and retardates (e.g., Berkson & 
Cantor, 1960; Rieber, 1964) when items 
are learned directly in all stages of PA 
transfer paradigms. Under the present con- 
ditions of vicarious List-1 learning, how- 
ever, no such effects were obtained. Herein, 
it should be noted that while a trend 
towards interference seems to occur on 
Trial 1 for intentional learning Ss in Ex- 
periment III, (a) this trend was not 
statistically reliable, and (b) there was an
operant level tendency for INT items to
be somewhat more difficult than DT and
LTL items in all groups, including con-
trols.

Data relevant to the second aim of this 
research, that is, differential rates of learn- 
ing under direct and vicarious conditions, 
appear to provide reasons for these dis-
crepancies. For normal Ss in Experiments
I and III, there is a consistent trend for 
test performance to be higher when List-1 
pairs are learned directly than when they 
are learned vicariously. With the excep- 
tion of DT pairs in Experiment IV, the 
same trend was obtained for retardates 
in Experiments II and IV. If PA learning
is now conceptualized as a two-stage
process (Underwood & Schulz, 1960), then
it is clear that conditions conducive to
low levels of List-1 learning, for example,
vicarious procedures, will primarily favor
the response learning stage, whereas con-
ditions producing higher levels of learning,
for example, direct procedures, will tend
to facilitate the associative learning stage.
Thus, in the absence of appreciable second-
stage learning, neither associative facilita-
tion nor interference would be expected. 
Indeed, with primarily response learning, 
the only items expected, and found, to 
show reliable transfer would be DT pairs. 
This interpretation is highly consistent 
with Martin’s (1965) more extensive theo- 
retical treatment of verbal paired-associate 
transfer and is consonant with a study re- 
cently reported by Winnick (1966) who 
failed to find differences between LTL and 
A-Br paradigms with a 50% List-1 learn- 
ing criterion. 

Additional support for this interpreta- 
tion is provided by noting that reliable 
interference was obtained only in normal 
Ss receiving 15 List-1 trials directly. An 
examination of performance on List 1 indi-
cates that the mean number of correct
responses per transfer pair for the five
blocks of three trials is 1.24, 1.66, 1.82,
1.86, and 1.95, respectively. Further, 88%
of these Ss reached an overall criterion of 
at least five-sixths by the sixth trial. For 
retardate Ss, the corresponding means for 
Blocks 1-5 are .75, 1.24, 1.38, 1.58, and 1.56, 
with 50% of the Ss reaching the same cri- 
terion by the seventh trial and 68% by the 
fifteenth trial. For both populations, the 
means for DT, INT, and LTL pairs are 
similar over all trials. It is clear, therefore, 
that significant interference occurred under 
conditions of relatively marked first-list 
learning and that such conditions were nei- 
ther present in the majority of direct learn- 
ing retardate Ss nor likely to be present in 
any S's learning List 1 vicariously. 

Seemingly contradicting the foregoing 
reasoning concerning the superiority of 
direct as against vicarious acquisition, 
VL Ss in Experiment IV showed signifi- 
cant transfer to DT pairs whereas DL Ss 
were no different from controls on any 
of the items. Since previous discussion must 
necessarily eliminate insignificant amounts 
of List-1 learning as primarily responsible, 
it is likely that these differences are at- 
tributable to conditions of testing. Specifi- 

cally, the differences may be due to the 
“multiple rule” requirement of the mixed- 
list transfer design. This refers to the fact 
that with this design as many strategies 
are required of Ss as there are transfer
paradigms. It has been shown that relative
to unmixed lists, performance with mixed
lists is slightly poorer in a number of
paradigms (Postman, 1966), including
A-B, A-B (Slamecka, 1967). Presumably
this is a result of interference produced
by attempts to apply multiple rules to
differing transfer pairs. Relative to VL Ss,
and assuming correspondingly greater 
amounts of associative learning on List 1, 
the somewhat depressed performance of 
DL Ss on DT pairs would appear to agree 
with these findings. That the same decre- 
ment was not noted for DL Ss in Experi- 
ment III, however, may be due to the 
fact that there was considerably greater 
List-1 learning for these Ss and/or that 
normals are better able to overcome this 
source of interference than retardates. 

The significant facilitation obtained 
under conditions of vicarious List-1 learn- 
ing in Experiment III was independent of 
instructions concerning the scholastic com- 
petence of the model. In Experiments I 
and II, preferred peers selected by Ss from 
their own classrooms or cottages to serve 
as models were equally ineffective in pro- 
ducing reliable transfer following vicarious 
List-1 learning. While other research has 
frequently found this variable to be of con- 
siderable importance in observational 
learning (Bandura & Walters, 1963), the 
present results suggest that more tradi- 
tional verbal learning variables are likely 
to be of greater importance in vicarious 
paired-associate transfer. 

In agreement with results reported by
Otto (1961), reading level was signifi-
cantly related to PA test performance in
Experiment I: Ss scoring high on a read-
ing comprehension test learned the PA list
significantly faster than Ss scoring in either
the middle or low ranges of the reading
test. Inasmuch as reliable treatment effects
were not obtained in this study, however,
statements concerning possible differences
in vicarious learning for these three groups
must await further research.


REFERENCES 


Anderson, R. C. Educational psychology. Annual
Review of Psychology, 1967, 18, 129-164.

Bandura, A., & McDonald, F. J. The influence
of social reinforcement and the behavior of
models in shaping children’s moral judgments.
Journal of Abnormal and Social Psychology,
1963, 67, 272-281.

Bandura, A., & Walters, R. H. Social learning
and personality development. New York: Holt,
1963.

Berger, S. M. Conditioning through vicarious
instigation. Psychological Review, 1962, 69, 450-
466.

Berkson, G., & Cantor, G. N. A study of media-
tion in mentally retarded and normal school
children. Journal of Educational Psychology,
1960, 55, 129-134.

Denny, M. R. Research in learning and perform-
ance. In H. Stevens & R. Heber (Eds.),
Mental retardation. Chicago: University of Chi-
cago Press, 1964.

Ditrichs, R., Simon, S., & Greene, B. Effect of
vicarious scheduling on the verbal conditioning
of hostility in children. Journal of Personality
and Social Psychology, 1967, 6, 71-78.

Elkonin, D. B. The physiology of higher nervous
activity and child psychology. In B. Simon (Ed.),
Psychology in the Soviet Union. Stanford: Stan-
ford University Press, 1957.

Fletcher, H. J. Discrimination learning by re-
tardates as a function of number of implicit
response trials. Psychonomic Science, 1966, 4,
159-160.

Kanfer, F. H., & Marston, A. R. Human rein-
forcement: Vicarious and direct. Journal of Ex-
perimental Psychology, 1963, 65, 292-296.

Marston, A. R. Variables in extinction following
acquisition with vicarious reinforcement. Journal
of Experimental Psychology, 1964, 68, 312-315.

Marston, A. R. Determinants of the effects of
vicarious reinforcement. Journal of Experimental
Psychology, 1966, 71, 550-558.

Martin, E. Transfer of verbal paired associates.
Psychological Review, 1965, 72, 327-343.

McDavid, J. W. Imitative behavior in preschool
children. Psychological Monographs, 1959, 73(16,
Whole No. 486).

McLaughlin, B. “Intentional” and “incidental”
learning in human subjects. Psychological Bul-
letin, 1965, 63, 359-376.

Norcross, K. J., & Spiker, C. C. Effects of me-
diated associations on transfer in paired-associate
learning. Journal of Experimental Psychology,
1958, 55, 129-134.

Otto, W. The acquisition and retention of paired
associates by good, average, and poor readers.
Journal of Educational Psychology, 1961, 52,
241-248.

Palermo, D. S., & Jenkins, J. J. Word association
norms: Grade school through college. Minneap-
olis: University of Minnesota Press, 1964.

Paschke, R. E., Simon, S., & Bell, R. W. Vicarious
discrimination learning in retardates. Journal of
Abnormal Psychology, 1967, 72, 536-542.

Postman, L. Differences between unmixed and
mixed transfer designs as a function of paradigm.
Journal of Verbal Learning and Verbal Behavior,
1966, 5, 240-248.

Rieber, M. Verbal mediation in normal and re-
tarded children. American Journal of Mental
Deficiency, 1964, 68, 634-641.

Simon, S., Ditrichs, R., & Jamison, N. Vicarious
learning of common and uncommon associations
in children. Psychonomic Science, 1965, 3, 345-
346.

Slamecka, N. J. Transfer with mixed and unmixed
lists as a function of semantic relations. Journal
of Experimental Psychology, 1967, 73, 405-410.

Underwood, B. J., & Schulz, R. W. Meaningfulness
and verbal learning. Philadelphia: Lippincott,
1960.

Van Wagenen, R. K., & Travers, R. M. W. Learn-
ing under conditions of direct and vicarious re-
inforcement. Journal of Educational Psychology,
1963, 54, 356-362.

Winnick, W. A. Effect of instructional set and
amount of first learning on negative transfer.
Journal of Experimental Psychology, 1966, 71,
920-923.

Zeaman, D., & House, B. J. The role of attention
in retardate discrimination learning. In N. R.
Ellis (Ed.), Handbook of mental deficiency.
New York: McGraw-Hill, 1963.

Zigler, E. Research on personality structure in the
retardate. In N. R. Ellis (Ed.), International
review of research in mental retardation. Vol. I.
New York: Academic Press, 1966.


(Received August 10, 1967) 


Journal of Educational Psychology 
1968, Vol. 59, No. 5, 350-354 


EXTRA-SCOPE TRANSFER IN LEARNING MATHEMATICAL
STRATEGIES¹


JOSEPH M. SCANDURA AND JOHN H. DURNIN
University of Pennsylvania 


The notion of a restricted rule or strategy was introduced. It was
hypothesized that extra-scope transfer depends on the extent to
which a statement of strategy may be viewed as a restriction of a
more general strategy. 66 high school Ss were taught a restricted
statement (S’, SG’, or G’) of 1 of 3 strategies of varying generality,
S (= S’) < SG (SG’) < G (G’). 22 Ss served as a control (C). All
Ss were tested on 6 problems, the 1st 2 within the scope of the most
specific strategy (S), the 2nd 2 within the scope of only the more
general strategies (SG and G), and the last 2 only within the scope
of strategy G. Statements S’, SG’, and G’ were directly applicable
only to the 1st 2 problems. Groups SG’ and G’ evidenced extra-scope
transfer, Groups S’ and C did not. In addition, performance on the
2nd problem of each pair was contingent on performance on the
corresponding 1st problems, indicating that “what is learned” may
be determined by performance on single test items and used to pre-
dict performance on additional similar-scope problems. Suggestions
were made for future research.


Scandura, Woodward, and Lee (1967)
demonstrated that performance on transfer
tasks is generally in accord with the logi-
cally determined scope of rule and strategy
statements.² In each of two experiments,
Ss were presented with one of three state-
ments of rules (or strategies) of varying
generality and were tested on three prob-
lems. The first problem was within the
scope of all three rules; the second, within
the scope of only the two more general
rules; and the third, within the scope of
only the most general rule. In most in-
stances, there was essentially no difference
in the level of performance on the within-
scope problems and no extra-scope transfer
(to problems to which the rule did not
directly apply).

There was one glaring exception involv-
ing extra-scope transfer. In Experiment II,
Ss given the statement, “50 x 50,” which
was directly applicable only to Problem 1,
performed equally as well on Extra-Scope
Problem 2 as did those Ss given the state-
ment, “n X n,” where the dimension (i.e.,
variable) n was allowed to vary over the
positive integers. This result obtained even
though “n X n” was directly applicable to
both Problems 1 and 2. While the study
itself was inadequate to specify the source
of this transfer, a post hoc analysis of the
experimental treatments indicated that
“50 x 50” was the only rule statement in-
cluded in the study which was in some sense
a restriction of a more general rule or
strategy. The statement, “50 x 50,” could
be obtained from the more general state-
ment “n x n,” by replacing the response
determining dimension, n, by the value 50.
More generally, it would appear that a
restricted statement may be viewed as one
obtained by replacing response-determining
dimensions (see Scandura, 1966, 196[...],
1968a) in the statement of a general rule or
strategy with the specific values of a par-
ticular instance. The authors, therefore, con-
jectured that a restricted rule statement
might well provide a basis for generalization
to all problems within the scope of the
corresponding unrestricted rule. The pri-
mary purpose of this study was to test this
hypothesis.

¹The participation of the second author was
made possible by a United States Office of Educa-
tion Graduate Training Grant to the University of
Pennsylvania in Mathematics Education Research.
The authors are indebted to I. R. Klingsburg, head
of the mathematics department, and the partici-
pating students at the West Philadelphia High
School for their generous cooperation in making
this study possible.

²The terms “rule” and “strategy” are used
synonymously throughout this paper. While “rule”
is the preferred technical term (e.g., Scandura,
1968), “strategy” better connotes more complex
multiphased rules of the sort used in this study.

A secondary purpose was to obtain fur- 
ther information on the “consistency” hy- 
pothesis. Under certain conditions, it has 
been found that transfer to one instance of 
a rule almost invariably implies transfer 
to other instances of the rule (Scandura, 
1966, 1967a, 1967b, 1968b; Scandura et al., 
1967). As was the case with extra-scope 
transfer, however, one exception to the con- 
sistency hypothesis was obtained in the 
study by Scandura et al. (1967). The level 
of performance on one within-scope problem 
was considerably below that on the others. 
Whereas the response determining values of 
the homogeneous problems differed along a 
single dimension, the exceptional within- 
scope problem differed along a second di- 
mension as well. Taking this observation 
into account, a modified form of the con- 
sistency hypothesis was advanced. It was 
hypothesized that if transfer to one problem 
indicates that a particular rule or strategy 
(e.g., “50 x 50”) has been generalized
along one or more familiar dimensions
(e.g., to “n x n”) then transfer to addi-
tional problems along the same dimensions
(and within the scope of a less restrictive
rule) should also be expected.


Method
Material


The material was based on a variant of the num-
ber game, “NIM.” In this variant, two players
alternately select numbers from a specified set of
consecutive integers (including 1) and keep a run-
ning sum. The winner is the one who picks the
last number in a series having a predetermined
sum. Each such game can be characterized by an
ordered pair (n, m) where the corresponding value
of n is the largest integer in the selection set and
the value of m is the predetermined sum (n and
m refer to dimensions over which NIM may vary).
If the set consists of the integers 1-6 and the sum
is 31, the players alternately select numbers 1-6
until the cumulative sum is either 31 or above
(in which case no one wins).
Scandura et al. (1967) presented statements of
three general rules by which the person making the
first selection can always win. The specific (S) rule
is sufficient for winning only (6, 31) games and
was stated:


In order to win the game you should make 3
your first selection. Then you should make selec-
tions so that the sums corresponding to your
selections differ by 7.


The specific-general (SG) rule, a unidimensional
strategy, is applicable to any game of the form
(6, m) and was stated:


In order to win the game, the appropriate first
selection is determined by dividing the desired
sum by 7. The remainder of this division is pre-
cisely the selection which should be made
first. ...


The general (G) rule, a bidimensional strategy, is 
applicable to any (n, m) game, where both n and
m are allowed to vary, and was stated: 


In order to win the game the appropriate first 
selection is determined by adding one to the 
largest number in the set from which the selec- 
tions must come and dividing the desired sum 
by this result. The remainder of this division is 
precisely the selection that should be made 
first. Then you should make selections so that 
the sums corresponding to your selections differ 
by one greater than the largest number in the 
set from which the selections must come.
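The three strategies can be illustrated with a short sketch of rule G, of which SG (n fixed at 6) and S (n = 6, m = 31) are special cases. The function names are illustrative and do not come from the study. The check that the first player wins against every possible opponent reply is an added verification, and the handling of games in which the remainder is zero (where the stated rule yields no legal first selection) is an assumption.

```python
def winning_move(n, m, total=0):
    """Selection that puts the running sum on the next target sum.

    Implements rule G: targets are the sums congruent to m modulo (n + 1).
    Returns None when no legal selection (1..n) reaches a target.
    """
    move = (m - total) % (n + 1)
    return move if 1 <= move <= n else None


def target_sums(n, m):
    """Running sums the first player should hold after each selection."""
    first = winning_move(n, m)
    return [] if first is None else list(range(first, m + 1, n + 1))


def always_wins(n, m, total=0):
    """True if following the rule wins against every opponent reply."""
    move = winning_move(n, m, total)
    if move is None:
        return False
    total += move
    if total == m:
        return True
    return all(
        always_wins(n, m, total + reply)
        for reply in range(1, n + 1)
        if total + reply <= m
    )


print(winning_move(6, 31))  # 3
print(target_sums(6, 31))   # [3, 10, 17, 24, 31]
print(always_wins(6, 31))   # True
print(winning_move(6, 28))  # None
```

For the (6, 31) game this reproduces the first selection of 3 and successive sums differing by 7, as in rule S.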


Statements of restrictions of these strategies, 
which are applicable only to (6, 31) games, were 
constructed for use in this experiment. Rule S’ was
essentially identical to rule S and was stated in the
same way. Rule SG’ was a restriction of unidimen-
sional strategy, SG, in the sense that SG was re-
stricted to one value (i.e., 31) of the desired-sum
(m) dimension. Rule SG’ was stated:


The appropriate first selection is determined
by dividing 31 by 7. The remainder 3 should be
your first selection. ...


Rule G’ was a restriction of bidimensional strat-
egy, G, in that G was restricted to one value (i.e.,
31 and 6, respectively) of the desired-sum (m) and
size-of-selection-set (n) dimensions. Rule G’ was
stated: 


The appropriate first selection is determined by
adding one to six, (1 + 6), and dividing 31 by
this result. The remainder 3 of this division is the
selection which should be made first. ... It is im-
portant to notice that 7 = 6 + 1.

The materials were reproduced by mimeograph ...
four treatments, ... the (6, 31) game. ...
completed (6, 31) game in which ...
running sums in accordance with a specified sequence
of selections. Knowledge of results was given on 
page 3 along with another (6, 31) practice game 
with the result of the latter given on page 4. 
Nothing was said in the introduction about game- 
winning rules, but it was mentioned that there are 
many variations of NIM. 

Three of the four treatment booklets included 
one of the restricted statements (S’, SG’, and G’) 
on page 1 together with a common (6, 31) game 
which was provided for practice. The solution to 
this (6, 31) game was given on page 2 and Ss were 
instructed to replay the same game on page 3
after correcting any previous errors. In this com-
mon (6, 31) game, the running sums were 3, 5, 10,
11, 17, 20, 24, 25, 31. The fourth booklet served as 


a control. It consisted solely of the common (6, 31) 
example with no statement of a game-winning 
rule. Nonetheless, by remembering those sums 
which corresponded to the winning selections (i.e., 
3, 10, 17, 24, 31), an S might conceivably win any 
new (6, 31) game. 
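
The common (6, 31) example can be replayed with a short Python sketch to show why the sums 3, 10, 17, 24, 31 are the ones worth remembering. The function names `reply` and `replay` are hypothetical, and the opponent's selections (2, 1, 3, 1) are inferred here from the running sums quoted above rather than stated in the article.

```python
def reply(current_sum, n, m):
    """Selection that moves the running sum to the next winning sum.

    Winning sums are those that leave the same remainder as m when
    divided by (n + 1). Returns None if no legal selection (1..n) works.
    """
    step = (m - current_sum) % (n + 1)
    return step if 1 <= step <= n else None


def replay(opponent_selections, n=6, m=31):
    """Running sums when the first player follows the rule throughout."""
    sums, total = [], 0
    moves = iter(opponent_selections)
    while total < m:
        pick = reply(total, n, m)
        if pick is None:
            raise ValueError("no winning selection available")
        total += pick
        sums.append(total)
        if total >= m:
            break
        total += next(moves)  # opponent's printed selection
        sums.append(total)
    return sums


if __name__ == "__main__":
    print(replay([2, 1, 3, 1]))
```

Whatever the opponent selects (1 through 6), the first player can always restore a sum from the list, because consecutive winning sums are 7 apart.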

The four test booklets corresponded to the four 
treatment booklets. Page 1 was common to all 
test booklets and explained how to use the book-
let. Each of the successive pages (2-7) included
one common test game together with that game-
winning statement (S’, SG’, or G’) associated with
the corresponding treatment booklet. This pro-
cedure was followed to help eliminate errors due
to recall. The “opponent’s” selections were printed
in the booklet and S was instructed to make his
selections and to compute the running sums. The
first two problems, 1A and 1B, were (6, 31) games.
Problems 2A and 2B were (6, m) games which
differed along the desired-sum dimension, with m
= 25 and m = 29, respectively. Problems 3A and
3B were (n, m) games, which differed along both
the desired-sum and size-of-selection-set dimen- 
sions, with n = 5, m = 26 and n = 7, m = 33, 
respectively. 




Subjects, Design, and Procedure 


The Ss were 88 West Philadelphia High School 
students enrolled in an academic mathematics 
program. They were randomly assigned to three 
experimental groups (S’, SG’, G’) and a control
(C) so that each group included 22 Ss.

Each S completed the introductory booklet,
one of the four treatment booklets, and the cor- 
responding test booklet, in that order. The S was 
told to read the material carefully. The experi- 
ment was self-paced and with only a few excep- 
tions Ss completed the experiment well within the 
time limit of 40 minutes. 

The criterion measure was use of the appropriate 
pattern (AP). The S was given credit for using the 
AP if he won the game and employed an appro- 
priate game-winning strategy. All of the tests 
conducted were applied to 2 × 2 contingency
tables. When the measures were independent, the
exact Fisher-Yates formula was used (Finney,
1948); when correlated, a different nonparametric
test, based on χ², was used (McNemar, 1956, pp.
358-359). One-tailed tests were used in conjunction
with the stated hypotheses with an alpha level of
.05.


RESULTS AND DISCUSSION


Table 1 shows that restricted rule statements
may provide an adequate basis for
generalization. Statements of unidimensional
and bidimensional strategies, even
when restricted to particular values of the
dimensions, may result in transfer to
problems which differ from the training
problem (e.g., common example) along
these same dimensions. The three experimental
groups performed at essentially
the same level on problems 1A and 1B,
but there were 12 Ss in Groups SG’ and
G’, as compared to none in Group S’, who
were successful on problems 2A and 2B.
This difference was significant at the ...
level.
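
As an illustration of the exact test for independent measures, the comparison just described can be laid out as a 2 × 2 table and evaluated with a one-tailed Fisher-Yates computation. The sketch below is hedged in two ways: the function name `fisher_one_tailed` is hypothetical, and the table layout (12 successes among the 44 Ss in Groups SG’ and G’ against none among the 22 Ss in Group S’) is one plausible reading of the text, not a table reported by the authors.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the
    observed margins whose top-left cell is at least as large as a.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    upper = min(row1, col1)
    favourable = sum(comb(row1, k) * comb(row2, col1 - k)
                     for k in range(a, upper + 1))
    return favourable / comb(total, col1)


if __name__ == "__main__":
    # Rows: Groups SG' and G' combined, Group S'.
    # Columns: successful on problems 2A and 2B, not successful.
    p = fisher_one_tailed(12, 32, 0, 22)
    print(f"one-tailed p = {p:.4f}")
```

Under this assumed layout the probability falls below the .05 alpha level adopted for the experiment.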

A cursory review of the literature suggests
that a number of other studies may also have
involved generalizing along one or more
dimensions of a restricted rule statement. ...
providing S with a problem-solving strategy
as it applied to one problem (i.e., with a
restricted statement), improved the level of
performance on a second problem (which
was presumably within the scope of a more
general strategy). Some such generalization
mechanism may also be involved in
what some investigators have called “remote
transfer.” Thus, in a recent study,


TABLE 1
NUMBER OF APPROPRIATE PATTERNS

Problem

Note. Abbreviated: C = control, S’ = restricted
specific, SG’ = restricted specific-general,
G’ = restricted general.


Wittrock’s (1967) nonreplacement-strategy
group was presented with a restriction of 
a general strategy which was applicable 
to his remote transfer items. Apparently, 
what these Ss actually learned (i.e., dis-
covered) was the more general strategy.³

The performance of the G’ Ss, however, 
suggests that transfer cannot necessarily 
be expected to all problems within the 
scope of the rule from which a restricted 
statement is derived. Of the five Ss in 
Group G’ who were successful on problems 
2A and 2B, none was successful on prob- 
lem 3A and only two, on problem 3B. 
These differences between problems 2A and 
2B and problems 3A and 3B suggest that 
the level of performance on transfer prob- 
lems may depend on the particular dimen- 
sion(s) involved. Problems 2A and 2B re- 
quired that the G’ statement be generalized 
only along the desired-sum dimension 
whereas problems 3A and 3B required gen- 
eralization along the size-of-selection-set 
dimension as well. Apparently, the G’ Ss 
were more capable of making the former 
generalization than the latter. 

The authors also feel obliged⁴ to com-
ment on the fact that two SG’ Ss gen-
eralized beyond the scope of rule SG to
problems 3A and 3B. These SG’ Ss were


³ Many psychologists feel that “what is learned”
is excess theoretical baggage since the notion must
invariably be defined in terms of transfer. While
admitting the ultimate necessity of operational
definition, the authors take the position that “what
is learned” is a useful construct. In particular,
performance on two test items (one training and
one transfer) can often be used to identify “what
is learned” by individual Ss, thereby making it
possible to predict their performance on addi-
tional transfer items. This latter assertion is well
exemplified by the present consistency data.

⁴ A program of ongoing research by the first
author and his collaborators is aimed at uncovering
laws of mathematical learning and behavior
which hold in a deterministic (or near-determin-
istic) sense. Thus, when exceptions occur, even
if the effects are not “statistically significant,”
they are viewed as facts to be explained and not
probabilistic deviations which may be safely
ignored. Although both the behaviors in question
and methods of approach differ greatly, the
authors’ research objectives are quite similar to
those adopted long ago by Skinner and his fol-
lowers: to uncover idiographic laws.

apparently as able to generalize along the
size-of-selection-set dimension as were the
two G’ Ss who were successful on problem
3B. Thus, the statement cue, “7,” in state-
ment SG’ was equally as helpful as the cue,
“6 + 1,” in statement G’ even though “6”
in the latter cue corresponded directly to
the number of integers in the selection set.
(The former cue, “7,” was one larger.) The
S’ Ss, on the other hand, seemed uniformly
unable to generalize along either dimension.
To do so, they would have had to have 
observed that the desired sum, 31, when 
divided by the constant difference, 7, 
leaves a remainder of 3 (the first selec- 
tion). 

These observations suggest that the ease 
with which response-determining properties 
of an illustrative (training) problem can be 
related to the corresponding response-deter- 
mining value (cue) in a restricted statement 
may have an important effect on the extent 
of transfer. A pilot study conducted with 20 
highly motivated and mathematically ori- 
ented doctoral students at the University of 
Pennsylvania tends to provide further sup-
port for this interpretation. All of the SG’
and G’ Ss and four out of five of the S’ Ss
were able to generalize to problems 2A, 2B, 
3A, and 3B. Clearly, the ease with which a 
correspondence can be determined between 
the determining properties of an illustrative 
problem and statement cues depends on 
individual differences as well as on the na- 
ture of the cue. A major task of future re- 
search will be to determine what the im-
portant individual differences are.

To test the consistency hypothesis, those 
Ss who used the AP on problems 1A, 2A, 
and 3A and those who did not (non-AP 
users) were compared as to AP use on 
problems 1B, 2B, and 3B, respectively. 
There were significantly more AP users on 
problem 1A who were AP users on problem 
1B than was the case for non-AP users on 
problem 1A (p < .001). The same relation-
ship held for problems 2A and 2B (p <
.001) and problems 3A and 3B (p < .001),
respectively. There were only 4 cases out of
a total of 131 in which a non-AP user (in
Groups S’, SG’, and G’) on an A problem
became an AP user on the corresponding B
problem. There was only 1 case (out of 
67) where an AP user on an A problem was 
not an AP user on the corresponding B 
problem. 
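
The consistency counts just reported can be tallied directly. The sketch below is a simple arithmetic check, with hypothetical variable and function names; it pools the three problem pairs exactly as the text does and makes no claim about the authors' own computation.

```python
# Counts reported in the text, pooled over problem pairs
# 1A-1B, 2A-2B, and 3A-3B for Groups S', SG', and G'.
NON_AP_ON_A = 131      # non-AP users on an A problem
NON_AP_SWITCHED = 4    # of these, AP users on the B problem
AP_ON_A = 67           # AP users on an A problem
AP_SWITCHED = 1        # of these, not AP users on the B problem


def consistency_rate(non_ap, non_ap_switched, ap, ap_switched):
    """Proportion of cases with the same AP status on the A and B problems."""
    total = non_ap + ap
    consistent = total - non_ap_switched - ap_switched
    return consistent / total


if __name__ == "__main__":
    total_cases = NON_AP_ON_A + AP_ON_A  # 66 Ss x 3 problem pairs
    rate = consistency_rate(NON_AP_ON_A, NON_AP_SWITCHED,
                            AP_ON_A, AP_SWITCHED)
    print(total_cases, round(rate, 3))
```

The total of 198 cases agrees with 66 experimental Ss each contributing three A-B pairs.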

These results suggest that if transfer ob- 
tains on one new problem, which differs 
(from the training problem) along one or 
more dimensions, then transfer may be 
expected to other problems which differ 
along these same dimensions. Of course, 
the boundary conditions for this assertion 
still need to be determined. At the very 
least, it would seem that the dimension(s) 
in question would have to be familiar to 
Ss (but just what this familiarity entails is 
not entirely clear).


SOME FACTORS IN CHILDREN’S LEARNING AND
RETENTION OF CONCRETE RULES¹


ROBERT M. GAGNÉ AND VIRGINIA K. WIEGAND
University of California, Berkeley 


An experiment investigated effects of (a) number of rules, (b) immediate
vs. 3-day recall, (c) verbal vs. verbal plus pictorial cueing, and
(d) IQ, on the learning and retention of concrete rules. Each rule
was composed of a highly pronounceable CVC as the name of a
thing (a drawn shape), and an action (such as, “underline it”). 96
4th-grade children were the experimental Ss. Following sessions pro-
viding prelearning on thing concepts and review of action concepts,
different groups of children learned 3, 5, 7, or 9 rules from printed
booklets, reading and recording each rule once. Virtually complete
retention was obtained for 3 and 5 rules when measured immediately,
significantly less for 7 and 9. Significant effects were not found for
IQ or cueing method. After 3 days, the amount of retention was
about 20% under all conditions.


The learning and retention of ideas
have usually been studied with the use of
prose materials (e.g., Briggs & Reed,
1943; Cofer, 1941; Hall, 1955; King &
Russell, 1966). In such investigations, the
learner typically first listens to or reads a
prose passage for a specified number of
trials, and later is asked to recall or
recognize certain selected ideas embodied
in the passage. Such an approach gen-
erally assumes that what is learned and
retained will be influenced by the mean-
ingful context within which the particular
ideas to be recalled are imbedded. The
context may be presumed to provide cues,
as appears to be true with paired as-
sociates (McGovern, 1964); it may be a
source of mediators (Davidson, 1964); it
can provide “organizers” of the sort
applied by Ausubel (1962); or it may have
still other effects, as yet unidentified. 

In contrast to this context-imbedded ap-
proach, the present authors were inter-
ested in taking an analytic view of the
learning of ideas, one which at the outset
attempts to separate the effects of context
variables from others. This investigation,
then,


¹ The research reported herein was performed
pursuant to a contract with the United States
Department of Health, Education, and Welfare,
Office of Education. Thanks are due to William
..., principal, and to the fourth-grade teachers
of the Buena Vista School, Walnut Creek, Cali-
fornia: Sally Collins, Carol Richman, Arthur
..., and Jean Wagner.


was with a specification of the 
concept of “idea” that was as definite as 
possible. In pursuit of the latter goal, re- 
search of previous writers and theorists 
provided scant information. Accordingly, 
in order to proceed with this account, the 
authors had to propose some of their own 
definitions. These are tentative, subject to 
further development, and are given here 
for the purpose of aiding communication. 
For the concept of “idea” itself, the au- 
thors refer to Gagné (1965), who considers 
an idea to be a principle, or rule. The
latter is a capability inferred as having
been learned, which enables the individual
to demonstrate a sequence of concepts such
as “If A, then B.” The concepts which
make up a rule are (simpler) capabilities,
evidenced in overt behavior by the iden-
tifying of classes of objects or events. In-
cluded as concepts are classes of the in-
dividual’s own actions. A rule may, of
course, include a rather large number of
concepts in its sequence. However, the
simplest rule in a formal sense is one that
contains only two concepts. A common
variety is a rule combining a “thing con-
cept” with an “action concept,” as in
“birds fly,” “water wets,” and “If green,
go.”
A rule may be concrete or abstract, de- 
pending upon whether its component con- 
cepts are concrete or abstract. If the two 
components of a rule may be represented 
by a concrete noun and a verb describing
overt action, the rule is concrete. A par-
tially abstract rule combines a concept
represented by a concrete noun with a con-
cept represented by an abstract verb, as
exemplified by “women suffer” or “snakes
live.” The reverse combination, abstract
noun-concrete verb, may also be called
partially abstract. There are also fully 
abstract rules, like “power corrupts.” The 
authors believe these distinctions are of 
some importance to the study of rules. 
However, no more extensive discussion of 
them will be undertaken here; they are 
mentioned in order to convey the meaning 
of “concrete rule.” 

The present study had the purpose of 
investigating some factors with potential 
influence on the learning and retention of 
concrete rules, of the sort sometimes called 
facts. The major variable of interest was 
the number of such rules to be learned in a 
single set, when presented one after an-
other. Another variable was the manner of
cueing used in eliciting the rule during a 
test of retention. Both immediate and de- 
layed (3-day) retention were measured. 

It is of some interest to note that the 
amount of retention of learned ideas 
varies over a wide range, as reported in 
different studies. It is also true that varia- 
tions in procedure are many, making com- 
parisons among studies difficult or impos- 
sible. An early study by Yoakam (1921), 
for example, found retention of a prose 
passage read a single time to be from 
48% to 80% as complete after 20 days as 
retention measured immediately, in differ- 
ent groups of children. In the study of 
Dietze and Jones (1931), a single reading
of a 1,000-word article yielded scores 
ranging from 44% to 84% for immediate 
retention, and from 26% to 42% after 30 
days. Newman (1939) reports 87% reten- 
tion after 8 hours of sleep, 86% after 8 
hours of waking, of 12 main ideas con- 
tained in a 300-word story. Comparable 
scores were 47% after sleep and 23% after 

waking for 12 nonessential ideas. In Cofer’s
(1941) study of the learning of prose pas-
sages, no evidence of retention was found
after 9 months, as measured by relearning
scores, in Ss who learned the material for
logical (as opposed to verbatim) reproduc-
tion.


RATIONALE FOR AN ANALYTIC STUDY


It is assumed that a single concrete rule,
whose component concepts are readily 
available to recall, is learned in a single 
trial. If one then undertakes to learn two 
or more such rules in a row, the possi- 
bility exists that they will not exhibit per- 
fect immediate retention. How many such 
rules can be learned (and retained) when 
presented one after another? What is the 
effect of the variable of number of concrete 
rules on the retention of such rules as in- 
dicated both by a test immediately follow- 
ing learning and by a test given 3 days 
later? These are the questions addressed in 
this study. 

Specifically, fourth-grade children who 
had previously learned thing concepts, and 
recalled some familiar action concepts, 
were given concrete rules to learn. Each
rule combined a thing (a drawn shape)
with an action (such as underlining the 
shape). In designing the thing and action 
concepts making up each rule, the factor of 
familiarity was considered. Since retention 
of the rules was to be cued by verbal 
names, it was desired to keep them equally 
unfamiliar, and thus to avoid the possible 
differential effects of mediating verbal as- 
sociations. Accordingly, nonword names 
were learned for the things. For action 
concepts, however, since these were to be 
recalled, no necessity was seen to insure 
that they were unfamiliar. Instead, they 
were chosen to be highly familiar and 
highly overlearned. 

The children learned these rules from
the pages of booklets, each page of which
gave a printed statement of the rule and
then asked the learner to draw it. Different
groups of children learned different num-
bers of rules: three, five, seven, or nine.
Retention of the rules was then measured
immediately; and in separate groups, after
3 days. For the purpose of investigating
cues to recall, half of the retention book-
lets cued the rules by means of only the
verbal name of the thing concept, half by
the verbal name plus a drawing of the
thing concept.


Subjects


The learners in the study were 96 fourth-grade 
children in an elementary school located in a 
primarily middle-class suburban community. With 
the cooperation of their four teachers, the experi- 
ment was conducted as an exercise in science in- 
struction, in which the children were informed 
participants. The study began with 104 children. 
The rules of random assignment were followed 
throughout, with the exception that two students 
were excluded because they did not complete the 
prelearning. With absences and other contingencies 
taken care of, there was a total N of 96 for the 
study.


Materials 


Prelearning of thing concepts. Four different 
study sheets were prepared to accomplish the pre-
learning of thing concepts. Each sheet showed a
thing (shape) with its printed name (nonsense
syllable) underneath it in nine boxes having a
scattered arrangement on the page. Ten different
test sheets were also made up, containing nine 
boxes running vertically, each with a printed con- 
cept name and a space for the shape to be drawn. 

The concept names, typed in capital letters, 
were nine consonant-vowel-consonant (CVC) syl-
lables with relatively high pronounceability values,
averaging 2.49, as given by Underwood and Schulz
(1960). No consonants were used in common in 
either the first or third position, and vowels were 
balanced in frequency as nearly as possible (sor, 
FAC, SUM, LAR, NOP, REL, SUD, TIS, ZIN). The figures 
were chosen to be distinctive and easy to draw;
they included such common figures as a square, a
circle, a heart, and a crescent. In successive study

sheets, the figures were shown in four different
orders from the top to the bottom of the page.
In addition, considerable variation was introduced
into nonrelevant features such as size and thickness
of outline, without changing the essential shape.
A similar plan was followed for test sheets.

Review of action concepts. A booklet was used
to review nine different actions. Each action was
first shown accompanied by printed instructions
which were read by E (i.e., “There is a line under
it.”). The child then executed the action
on a second figure of the same kind on the same
page and again with another figure on the next
page. The figures employed in this booklet were
different from those used in learning. The nine
actions represented were: line under, tail on, cir-
cle around, question mark after, dot in, check be-
fore, ... over, X through, and legs on. The booklet
included a final test for all nine actions, presented
four on one page and five on the next, with printed
instructions (“Draw a circle around the shape.”)
for each action.

Rule learning. The booklets for rule learning
were designed to be used by each child in indi-
vidual learning. Each page first identified a thing
concept (“This is a NOP.”). Then the statement of
the rule was made: “Rule: A NOP has a circle
around it.” The rule was then illustrated. Finally,
the printed instructions said, “Draw the rule for a
NOP,” and a blank space was provided where this
was to be done. To meet design requirements, a
fourth of the booklets presented three rules; a
fourth, five; a fourth, seven; and a fourth, nine.
Each thing concept, and each thing-action com-
bination, was made to have an approximately
equal frequency of occurrence in booklets for each
of the treatment groups pertaining to number of
rules (three, five, seven, and nine). Within this
constraint, there was random assignment of rule 
booklets among Ss within each of these groups. 

Filler activities. Several booklets were prepared 
containing filler activities, to be used when each 
learner finished his learning booklet. These were 
necessary in view of the fact that the number of 
rules presented in learning was as few as three for
some Ss, and as many as nine for others. The
filler booklets contained both geometrical and
numerical puzzles.

Retention. Booklets used to measure retention 
contained 1 page devoted to each rule. The four
varieties of booklets were 3, 5, 7, and 9 pages long,
to correspond with the number of rules learned by
different groups of Ss. Each page contained a
printed statement of the form, “Draw the rule for
a NOP.” To permit the investigation of the cueing
variable, half of the booklets included a drawn 
figure of each thing concept just below its name. 


Design 

The Ss were first divided into six subgroups 
representing six levels of IQ, as measured by tra-
ditional group test scores. In these Ss, this variable
had a median of 110, a range of 86-149, and a 
roughly normal distribution. The Ss within each 
IQ level were then assigned randomly to each of 
the 16 treatment conditions of the experiment. At 
the beginning of the study, it was possible to as- 
sign seven or eight to each condition. After attri- 
tional factors had taken their toll, the assignment 
of Ss to conditions ended up as shown in Table 1, 
achieved by the final discarding of data from only 
three Ss, chosen by a random process. 
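
The final balanced layout (six IQ levels by 16 treatment conditions, 96 Ss in all) can be sketched as a stratified random assignment. This is a minimal illustration, not the authors' procedure: the function name `assign`, the subject identifiers, and the seed are hypothetical, and the sketch ignores the initial over-assignment and attrition described above.

```python
import random
from itertools import product

RULES = (3, 5, 7, 9)
RETENTION = ("immediate", "delayed")
CUEING = ("verbal", "verbal+pictorial")


def assign(iq_levels, seed=0):
    """Assign one S per treatment cell within each IQ level.

    iq_levels: list of lists of subject ids, one inner list per IQ
    level, each holding as many ids as there are treatment cells (16).
    Returns a dict mapping subject id -> (rules, retention, cueing).
    """
    cells = list(product(RULES, RETENTION, CUEING))
    rng = random.Random(seed)
    assignment = {}
    for level in iq_levels:
        if len(level) != len(cells):
            raise ValueError("each IQ level needs one S per cell")
        shuffled = cells[:]
        rng.shuffle(shuffled)  # random order of cells within the level
        for subject, cell in zip(level, shuffled):
            assignment[subject] = cell
    return assignment


if __name__ == "__main__":
    levels = [[f"IQ{l}-S{i}" for i in range(16)] for l in range(6)]
    result = assign(levels)
    print(len(result), "Ss assigned")
```

Because each IQ level supplies exactly one S to every cell, each of the 16 cells ends with six Ss.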


TABLE 1
NUMBER OF SUBJECTS ASSIGNED TO TREATMENT
CELLS REPRESENTING THE EXPERIMENTAL
CONDITIONS

                                 Number of rules
Retention | Cueing condition   | 3 | 5 | 7 | 9
Immediate | Verbal             | 6 | 6 | 6 | 6
Immediate | Verbal + Pictorial | 6 | 6 | 6 | 6
Delayed   | Verbal             | 6 | 6 | 6 | 6
Delayed   | Verbal + Pictorial | 6 | 6 | 6 | 6

It will be seen that the design is a four factorial
one, making possible the determination of the ef- 
fects of four different variables: (a) number of 
rules learned; (b) immediate versus 3-day reten-
tion; (c) verbal versus verbal plus pictorial cueing
of recall; and (d) IQ level. 


Procedure 


The schedule for the study was as follows: Days 
1-4, prelearning of names of thing concepts; Day 
5, review of action concepts; Day 6, learning of 
rules, followed by retention test for half of the
total group; Day 7 (preceded by an interval of 72
hours), retention test for the other half of the
group. Each day’s session occupied approximately
a ½-hour period.

The prelearning of the names of thing concepts 
(shapes) began with a session in which E gave a
general introduction to the study. Following this,
E drew each shape on the blackboard and placed
beside it its printed name. When all nine shapes
had been introduced in this way, study sheets were
passed out to all the children. They were asked to
learn the names of all nine shapes. When a child 
thought he knew them all, he asked for a test 
sheet, and turned in his study sheet. When this in 
turn was turned in, he received another study 
sheet, followed by a second test sheet. Two study- 
test trials of this sort were conducted during each 
session for 4 days. In addition to the introductory 
trial, each S thus had a total of eight such trials. 
On the second test given on the fourth day, only 
two of the children did not identify all of the 
names correctly; data from these children were 
not used in the experiment. The criterion of the 
prelearning of thing concepts was thus a conjunc-
tive one: completing eight study trials and identi-
fying all nine concepts correctly on the final test.

On the day following the prelearning of thing 
concepts, the review and test session was conducted 
for the nine action concepts. Following a brief 
introduction, these trials were administered by 
means of the booklet previously described. All the 
children knew the action concepts after the first 
trial (and most probably before it). 

Whereas prelearning and the action concept re-
view were conducted in the four regular class-
rooms, the children were assembled in one large
room for the rule-learning and retention sessions.
Their places at the tables in this room were 
determined before their assembly, and booklets per- 
taining to the respective subgroups of the experi- 
ment were distributed at these places. In the learn- 
ing session, the children were first instructed to 
work through the booklets page by page, and when 
finished, to go on to the second booklet. They 
were also asked to record the time of starting each 
booklet. In this manner, once the trials for rule
learning were finished, the learners went on to the 
other activities contained in the filler booklets, 
regardless of whether they had studied three, five, 
seven, or nine rules. The session was brought to an 
end when it was observed that all children having
booklets containing nine rules had finished. Each
rule the child encountered was first presented in
printed form and illustrated, and then drawn by
him on the lower part of the same page. The mean
time per rule (each rule being composed of these
two steps of observing and recording) was found to
be .33 minutes in the three-rule group; .23 minutes
in the five-rule group; .22 minutes in the seven-
rule group; and .21 minutes in the nine-rule group.

For half of the total group, distributed equally 
in relation to the other variables of the experi- 
ment, the second booklet was a test of immediate 
retention, whereas for the other half, the second 
booklet contained filler activities. Retention book- 
lets began with no preliminary instructions, simply 
containing a statement like “Draw the rule for a 
ZIN” on each page, as previously described. Those
who finished these booklets before others went on 
to a third booklet of filler activities. 

Delayed retention was measured with the other 
half of the total group on a Monday, following a 
weekend, 3 days after the rule-learning session. 
In its essentials, the same procedure was followed
as for immediate retention. It may be noted that 
these children had not been told on Friday that 
they would be asked to recall the rules they had 
learned. 


RESULTS 


The raw data of the experiment were 
number of rules recalled, under the various 
conditions described in Table 1, and for 
six IQ levels within each of these condi- 
tions. An examination was also made of 
total overt errors, as well as of various 
categories of overt errors. Insofar as suffi-
cient data existed, trends in errors appeared
similar to those of correct responses; there-
fore, no more detailed analysis was under-
taken.

The data first were converted to per-
centage remembered. An analysis of vari-
ance of these data revealed a significant
effect of the immediate versus delayed re-
tention variable (F = 147, df = 1/15, p <
.01). Nonsignificant differences were found
for the variable of IQ; and also for the
variable of verbal versus verbal-plus-
pictorial cueing. The analysis did not re-
veal differences for the variable of number
of rules learned; for the data as a whole,
the percentage of retention was not sig-
nificantly different whether three, five,
seven, or nine rules had been learned (F =
23, df = 3/15, p > .05). In addition, no
significant interactions were found.




Cueing effects 

The extra cue to recall of the correct 
rule, namely, the provision of a picture of 
the thing concept rather than only its name, 
is not shown by these results to have aided
recall of the rule (F <1, df = 1/15, 
p > .05). Thus, the group of children who 
were cued to recall only by the name of the 
thing concept contained in it, had no more 
difficulty in recalling the rule than did 
children who were cued by a picture plus 
the name. These results may also be con- 
sidered as evidence that the children were 
able to use the name effectively, and that 
their performance in recalling rules was 
not affected by inability to recall the names 
per se. 


Number of rules learned 


The mean percentage recalled is shown 
(for the verbal and verbal-plus-pictorial 
groups combined) in Figure 1, for three, 
five, seven, and nine rules, under conditions 
of immediate and 3-day retention. It ap- 
pears that three and five concrete rules 
are learned virtually intact, when recall is 
Measured immediately. This near-perfect 
Performance appears to drop off, however, 
when seven rules are learned, and to 
undergo a further drop for nine rules. The
special circumstance to be considered here
is that, as compared with these mean
scores, the performance of the group which
learned five rules is 100%, with an SD of
0. It would seem legitimate to pose the
question as to whether the means for the
seven-rule and the nine-rule group are sig-
nificantly different from a perfect per-
formance. The usual statistical tests, how-
ever, are not designed to deal with measures
having zero variance. Accordingly, the fol-
lowing test was applied.

The standard error of the mean for the
group which learned seven rules was 7.6,
and a confidence interval at the 99%
level extends to the score 98.9. The mean
score for this group is thus significantly
different from 100, the score attained by
the group which learned five rules. Simi-
larly, for the group which learned nine
rules, the upper limit of the confidence
interval is 77.9, and by similar reasoning

[Figure 1. Vertical axis: Per cent remembered. Horizontal axis: Number of principles.]
Fig. 1. Percentage of principles remembered
immediately following learning (I) and after 3
days (D), by groups who were given different num-
bers of principles (three, five, seven, and nine) to
learn.


it may be said that the mean of this 
group (63.9) differs significantly from the 
perfect performance of the five-rule group. 
A t test of the difference between the imme- 
diate retention scores of the seven-rule and 
nine-rule groups indicated nonsignificance 
at the .05 level. 
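
The confidence-interval comparison against a perfect score can be sketched as follows. This is only an illustration, not the author's computation: the 99% multiplier of 2.576 is an assumed normal-theory value (the text does not state which multiplier was used), and the seven-rule mean and the nine-rule standard error are back-calculated under that assumption rather than reported in the source.

```python
# Assumed two-sided 99% normal multiplier; the source does not state it.
Z_99 = 2.576


def upper_limit(mean, standard_error, z=Z_99):
    """Upper limit of a confidence interval around a group mean."""
    return mean + z * standard_error


def below_perfect(upper, perfect=100.0):
    """True when the whole interval lies below the perfect score."""
    return upper < perfect


# Values reported in the text.
seven_rule_se = 7.6
seven_rule_upper = 98.9
nine_rule_mean = 63.9
nine_rule_upper = 77.9

# Back-calculated under the assumed multiplier (NOT reported in the source).
seven_rule_mean_implied = seven_rule_upper - Z_99 * seven_rule_se
nine_rule_se_implied = (nine_rule_upper - nine_rule_mean) / Z_99

print(below_perfect(seven_rule_upper), below_perfect(nine_rule_upper))
print(round(seven_rule_mean_implied, 1), round(nine_rule_se_implied, 2))
```

Both reported upper limits fall below 100, which is the sense in which the seven-rule and nine-rule means are said to differ from the perfect performance of the five-rule group.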

As for percentage recalled after 3 days, 
this tends to be around 20%, and shows 
only slight and insignificant change de- 
pending upon the number of rules learned. 


Discussion 


There appear to be some useful facts in 
the results of this experiment. When both 
thing concepts and action concepts are 
previously well learned, fourth-grade chil- 
dren can learn up to five concrete rules by 
reading and recording each once, one after 
another, when they are studied at an 
average rate of one every 15 seconds. When 
retention of seven and nine such rules is 
measured immediately, however, losses of 
the order of 20% and 36%, respectively, 
are found.

The dropping off of immediate retention 
scores under these conditions would be ex-
pected to occur as a result of interference
(cf. Keppel, 1968), both proactive and
retroactive. This experiment, of course, 
provides no direct confirmation that in- 
terference, rather than some other factor, 
is at work. Such a conclusion would need 
to be based upon an additional study in 
which, for example, conditions of learning
were arranged in order to contrast the ef- 
fects of variables favoring interference 
with variables not favoring it. Differing 
degrees of similarity among the elements 
of the rules to be learned might be a 
variable feasible to investigate with this 
theoretical purpose in mind. So far as the 
present results are concerned, interference 
is merely a likely cause of the observed 
relation between number of rules pre- 
sented and percentage retained imme- 
diately. 

The results also suggest another kind of 
possible relationship with theory. Post- 
man (1964) points out the relationship of 
the immediate memory span to the initial 
events of memory for items to be repro- 
duced serially. Jensen (1964) found a 
high degree of relationship between meas- 
ures of immediate memory span and the 
learning of serial lists of verbal items. Al- 
though the rules of the present experiment 
were not to be learned in serial order, they 
were presented serially. It may be of some 
significance that the falling off of im- 
mediate retention scores occurs in the in- 
terval (five-seven items) which has often 
been measured as the limit of memory 
span in children of the age used in this 
experiment. 

As for the amount retained after a 3- 
day delay, it is most interesting that this 
was no more than 20%. It should be borne 
in mind that several obvious factors favor 
this relatively low degree of retention. 
First, the children were given no particular 
incentive to remember, and it therefore 
seems likely that they undertook little re-
hearsal during the period intervening be-
tween learning and recall. Second, the
facts they were asked to learn and recall 
were isolated from each other and from any 
other meaningful context (except the most 
general one that they were learning rules 
containing new names for things). Under 
these circumstances, the 80% loss in re- 

tention appears to be the order of what 
may be expected. However, it is a high de- 
gree of loss when compared to that ob- 
tained in studies in which the ideas are 
imbedded in a context (e.g., Yoakam,
1921; Dietze & Jones, 1931; English, Wel-
born, & Killian, 1934).

The apparent contrast between the re- 
membering of isolated versus context-im- 
bedded ideas would appear to suggest 
one promising direction for additional 
study. On the one hand, such research 
needs to be related to current theoretical 
formulations of verbatim learning which 
emphasize the role of contextual stimuli 
(cf. Melton, 1967). On the other, there
exists the possibility of further experi- 
mental exploration of different kinds of 
contexts, such as the “subsuming” and 
“correlative” types emphasized in Ausu- 
bel’s (1967) theory. It seems likely that 
continuing to focus attention on the in- 
dividual “fact” or “rule” will be a useful 
strategy in such research. 

EFFECTS OF ORAL AND ECHOIC RESPONSES IN
BEGINNING READING¹


MARY H. NEVILLE 

96 Grade-1 children in 3 groups, at each of 2 learning aptitude levels,
were compared to test the hypothesis: Giving an echoic or oral re- 
sponse before silent reading will, by encouraging the application of 
intonation patterns to beginners’ reading, improve achievement and 
also reduce vocalization. 4 classroom teachers taught a preprimer 
vocabulary; E then taught the preprimer text to the groups, giving
rigid silent, oral, or echoic training before silent reading. Ss were 
tested for reading achievement and vocalization after each of the 3 
preprimers. Analysis of variance indicated that echoic groups read 
more fluently and that echoic and oral training reduced vocalization. 
No significant group differences were found for word recognition and
identification, or comprehension.


The view of reading as a code for the 
spoken language leads linguists to insist 
that the “first task of learning to read... 
is to teach the child to decode the written 
language to its language equivalents” 
(Levin, 1966, p. 140). By this they mean
that all the language signals must be pres- 
ent in reading, including the paralinguistic 
intonation patterns. Since it has been 
hypothesized that intonation patterns help 
to show the connectedness and completeness 
of groups of words (Bolinger, 1957) and 
aid complex language learning (Braine, 
1963), both Lloyd (1964) and Gliessman 
(1959) have suggested that intonation pat- 
terns, applied in reading, may also help 
the beginner unite singly identified words 
into units of thought. These hypotheses 
are supported by the relationship shown 
between the ability to apply normal in- 
tonation patterns to reading, and reading 
comprehension (Dearborn, Johnston, & 
Carmichael, 1949). 

Linguists have given little guidance on 
how to teach the child the correspondences 
between written symbols and spoken lan- 
guage (Levin, 1966). Reading aloud should 
be helpful. It utilizes the secondary rein-
forcement properties of speech, makes pos-
sible extrinsic reinforcement, and, due to
the learner’s self-correction tendencies, en-
ables him to bring his reading closer to the
normal sounds of speech (Skinner, 1957).
The association of meaning with the
written words is also facilitated by an oral
response (Hildreth, 1955; Osgood, Suci, &
Tannenbaum, 1957) and, after practice,
this meaning comes to be directly related
to the written form. As pronunciation of
words becomes superfluous, vocalization
during silent reading should decrease and,
by the process of cue reduction, silent
reading will gradually develop (Anderson
& Dearborn, 1952).

¹ The data on which this paper is based were in-
cluded in the author’s dissertation, presented in
partial fulfillment of the requirements for the
PhD degree at the University of Calgary. The
author is indebted to Ethel M. King, the super-
visor for the study.

Through language learning, beginning 
readers have become accomplished in 
echoic behavior (Mowrer, 1960; Skinner, 
1957) which they use to short-circuit the 
slower progressive approximation method 
of earlier language learning. Accurate echo- 
ing of speech has also acquired secondary 
reinforcing properties. The “look and say”
reading method makes use of this ac-
complishment for teaching single words
and could be extended to the teaching of
larger language structures. If a model
reading were given, the child’s oral
response would then become an echoic re-
sponse which should be more efficient than
the simple oral response for teaching the
application of intonation patterns to read-
ing.

The ability to organize singly identified 
words into a complete sentence should im-
prove several aspects of reading. Fluent
oral reading, defined (Harris, 1961) as 
reading where the words of the sentence 
are grouped in meaningful structural units 
indicated by appropriate pauses, stress, and 
inflections of voice, should occur. Reading 
comprehension should then improve and, 
with it, an increased use of context which 
should assist the recognition and identifi- 
cation of individual words (McCullough, 
1958). With overt-response teaching meth- 
ods, vocalization during silent reading 
should decline more quickly than when all 
responses to written words are silent. 

In their study of overt and covert re- 
sponses in reading, McNeil and Keislar 
(1963) suggest that the effects of an oral 
response may vary according to the learn-
ing aptitude of Ss. Sex differences were
also noted in an earlier study (Neville, 
1965) for children who listened to a pas- 
sage before reading it aloud. 

The purpose of the present study was to
investigate the effect of practicing an oral
response before silent reading, or practicing
an echoic response before silent reading, on
the reading criteria of word recognition
and identification, comprehension, fluency,
and vocalization and to compare the effects
of the oral and echoic responses at two
learning-aptitude levels and for each sex.

The hypothesis was: Practicing an echoic
response before silent reading would give
the best performance on all criteria while
practicing an oral response before silent
reading would give a better performance
than no overt practice before silent read-
ing.


METHOD


The experiment was performed in one large
school with four complete Grade-1 classes in a
lower-middle/upper-lower class area of Calgary.
Although all 110 Grade-1 pupils were taught in the
experimental classes, 6 of them were not consid-
ered as Ss because they were either repeating the
grade, could not speak English, or were outside the
chronological-age range of from 5 years 8 months
to 6 years 8 months. No child had learned to read
before entering school.

² The author would like to thank R. Warren,
superintendent, Calgary Public Schools, the school
principal, M. Gutiw, and the assistant principal,
[name illegible], for their assistance in this re-
search. She is especially grateful to the four
Grade-1 teachers, Sibyl Faid, Marian Hansen,
[initials illegible] Peters, and Christina Roberts.

During the first 3 weeks of school, three tests
predictive of reading success³ were administered.
As the predictive values for the three tests are very
similar (Olson, 1958) and the score distributions
were close to normal, scores were converted to z
scores and added together to form, for each S, a
composite score. Two groups of equal size were
formed by assigning Ss above and below the com-
posite score median to upper and lower learning-
aptitude levels.
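
The composite score and median split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the three score lists are hypothetical (no raw scores are given in the paper), the z scores use the population standard deviation (the source does not say which form was used), and pupils exactly at the median are placed in the lower level.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Convert raw scores on one test to z scores."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite_scores(*tests):
    """Sum each pupil's z scores across all tests."""
    per_test = [z_scores(t) for t in tests]
    return [sum(pupil) for pupil in zip(*per_test)]


def median_split(composite):
    """Label pupils above the composite median 'upper', the rest 'lower'."""
    cut = median(composite)
    return ["upper" if c > cut else "lower" for c in composite]


# Hypothetical raw scores for six pupils on three readiness tests.
test_a = [10, 12, 15, 18, 20, 25]
test_b = [5, 7, 6, 9, 11, 12]
test_c = [30, 28, 35, 40, 38, 45]

composite = composite_scores(test_a, test_b, test_c)
levels = median_split(composite)
print(levels)
```

Summing z scores gives each test equal weight, which matches the stated rationale that the three tests have very similar predictive values.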

For each level, Ss were randomly assigned to 
three treatment groups per sex. The Ss in each of
these six groups at each learning level were then 
randomly assigned to two classes. Thus, at each 
learning level there were two classes, each with 
three equal-sized treatment groups. The three 
treatment groups, combined over the two classes 
of a level, were comparable with regard to sex dis-
tribution. The teachers most suited to teach a par-
ticular level were chosen by the assistant princi-
pal. At each level the two teachers were randomly
assigned to classes. All four teachers had had 2
or more years’ experience in teaching Grade 1.
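
The two-stage random assignment described above (treatment within level and sex, then class within each resulting group) can be sketched as follows. The roster, the identifiers, the seed, and the round-robin dealing after a shuffle are all illustrative assumptions; the paper does not describe the randomization mechanics.

```python
import random
from collections import defaultdict

TREATMENTS = ("silent", "oral", "echoic")
CLASSES = ("class 1", "class 2")


def assign_pupils(pupils, seed=0):
    """Assign (pupil_id, level, sex) records to a treatment and a class.

    Stage 1: within each level-by-sex stratum, shuffle the pupils and deal
             them round-robin into the three treatments.
    Stage 2: within each level-by-sex-by-treatment cell, shuffle again and
             deal the pupils round-robin into the two classes of that level.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for pupil_id, level, sex in pupils:
        strata[(level, sex)].append(pupil_id)

    cells = defaultdict(list)
    for (level, sex), ids in sorted(strata.items()):
        rng.shuffle(ids)
        for i, pupil_id in enumerate(ids):
            cells[(level, sex, TREATMENTS[i % 3])].append(pupil_id)

    assignment = {}
    for (level, sex, treatment), ids in sorted(cells.items()):
        rng.shuffle(ids)
        for i, pupil_id in enumerate(ids):
            assignment[pupil_id] = (level, sex, treatment, CLASSES[i % 2])
    return assignment


# Hypothetical roster: 6 girls and 6 boys at each of the two levels.
roster = [(f"{level}-{sex}-{k}", level, sex)
          for level in ("upper", "lower")
          for sex in ("girl", "boy")
          for k in range(6)]
assignment = assign_pupils(roster)
```

Dealing within sex first is what keeps the three treatment groups comparable in sex distribution once the two classes of a level are combined.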

The normal reading texts (Reading for Mean- 
ing Readers, McKee, et al., 1957) and program 
used by the Calgary Public School Board were 
followed in the study. All Ss had worked through 
the same prereading program and continued then, 
in the fifth week of school, into the associated pre- 
primer reading material. In the upper-learning- 
level classes, the classroom teachers taught the 
entire vocabulary of the first preprimer at an aver- 
age rate of 2 words a day, this rate being increased 
to 2.5 words a day for the two subsequent pre- 
primers. The two lower-learning-level classes took 
one-third as much time again to learn the same 
words for each preprimer. To present the words 
the teachers used the usual “look and say” method 
outlined in detail in the teacher’s manual but the 
amount of context used in presenting each word 
was reduced to one very short sentence. At no time 
in the classroom was any other connected material 
read by Ss. 

After the preprimer vocabulary had been taught
by the teacher, E taught the reading of each pre-
primer for 10 consecutive school days, keeping the
time constant over levels. Reading was taught 
both morning and afternoon for all Ss, 3 pages of 
the 60-page preprimer being presented per session. 

At each learning level, Ss of the same treatment 
group were combined over the two classes and 
taught, or trained, together in a room near the 
Grade-1 classrooms. There were 17 or 18 Ss in each 
group plus 1 “non-S.” While a group was absent
from the classroom the teacher continued with
work other than the presentation of preprimer vo-
cabulary.

³ Pintner-Cunningham Primary General Abil-
ity Tests: Verbal Series, Form A; Harrison-Stroud
Reading Readiness Profiles, Auditory Subtests 4
and 5; Murphy-Durrell Reading Readiness Analy-
sis, learning rate subtest.


Training 

Three methods followed by E as she taught the
reading of the three preprimers constituted the 
treatments: silent (no oral response before silent 
reading), oral (oral response before silent read- 
ing), and echoic (echoic response before silent 
reading). The silent training approximated the 
method of the preprimer guidebook, and so the 
silent group could be considered as a control 
group. Training for any treatment did not vary 
between pages, preprimers, or levels. Each group 
followed the same treatment for each preprimer. 

The invariable procedure for each page of 
reading (outlined in Table 1) was as follows: Each 
page was introduced for all groups by picture read- 
ing. The children’s books were turned face down 
on their knees and only the picture in E’s book,
with the words masked, was discussed. The E al- 
ways asked, “Who is in the big picture? What is 
doing?” Books were then turned up and 
reading began. 

Before the first presentation of the page to the 
silent group and the third presentations to the 
oral and echoic groups (all silent reading—Table 
1), E asked the same comprehension question to
guide the reading. She then added, “Read with 
your eyes and do not move your lips.” 

For the second, oral presentation for the oral 
and echoic groups, E said, “Read out loud and
make your reading sound just like quiet talking.” 
For the second, silent presentation for the silent 
group they were told, “Read it again, silently. Do 
not move your lips. While you read, think how it
would sound if it were just like talking.” 

For the first and third presentations, respec- 
tively, the echoic and silent groups were told, 
“Look at the first word on the page. Now listen 
carefully to me (or child’s name in the silent 
group) and watch the words in the book. (Story 
character) is talking.” The E said for the first,
oral presentation of the oral group, “Read the 
story out loud. (Story character) is talking.” 

The Ss remained seated in front of E during all
oral reading. They read with normal speech vol-
ume but at differing rates so that reading was not
in unison. No distraction of any one child by the
voices of the others was observed.

TABLE 1
TREATMENT METHODS USED TO TEACH 1 PAGE OF READING

Presentation |                         Method
of page      | Silent              | Oral           | Echoic
First        | Silent reading      | Oral response  | Listen to model read by E
Second       | Silent reading      | Oral response  | Echoic response
Third        | Listen to oral      | Silent reading | Silent reading
             | reading by one S    |                |
             | of group            |                |

In all groups, if an S did not know a word, E
told it to him. Words giving general difficulty were
told to all groups and written on the blackboard.

Since all the words were taught before Ss had 
their experimental training, if an S had been ab- 
sent not more than 5 consecutive school days he 
continued on his return with his treatment group 
and the missed practice was made up in individual 
sessions. Such a procedure was unavoidable since 
two-thirds of the Ss were absent at some time 
during the 60 group-training sessions. 


Testing 


At the end of each preprimer every S was tested 
individually by an E-constructed test passage per- 
taining to the preprimer. Each of the three tests 
consisted of a story using all the words of the par- 
ticular preprimer, as well as other words from 
earlier preprimer vocabularies plus several new 
words. The only picture was of the preprimer char-
acter who was “talking” in the story. Ten asso- 
ciated comprehension questions were also con- 
structed for each story, five to test simple recall 
and five to test inferred understanding. The test 
passage, typed on quarto paper with a primary 
typewriter, was placed on a specially constructed 
reading stand on a child-sized table. Attached to 
the front of the stand was a rigid section on which 
S rested his chin. This section contained a very
sensitive microphone, concealed just below a net- 
covered aperture, and connected to a Uher 4000 
Report-L tape recorder which recorded voice 
sounds. No child indicated that he suspected the 
presence of a microphone.

Each S, taken in random order, sat in front of 
the stand with E beside him. She explained that
S was to read a story and that after it had been
read, questions would be asked about it. The S
was asked to read silently, and told to read with 
his eyes and not move his lips. No words were 
identified for S during this reading. The compre-
hension questions were then asked orally and the
answers recorded verbatim.

The E next asked S to read the story again,
this time out loud so that his reading would sound
“just like talking.” As S read orally, words that he
did not recognize or identify were told to him
after a pause of approximately 5 seconds and all
errors recorded after the scoring method of the
Gray Oral Reading Test (Gray, 1963). This testing
procedure was followed for each of the three pre-
primer tests.

The Grade-1 teachers marked the comprehen-
sion questions using model answers prepared by
E on the basis of a trial of the material in the
previous school year. The names on the record
sheets were concealed. Three “number right”
scores were obtained for each test: literal com-
prehension, inference comprehension, and total
comprehension scores.

When any speech sounds were discerned by E
on the silent reading recordings, the tapes were
listened to by E and two other raters. Vocalized
words were totaled over the test passage and aver-
aged over the three raters to obtain a vocalization
score. Interscorer reliability based on the average
intercorrelation among the three sets of ratings
was .48.

The E rated all the oral reading recordings for
fluency, as defined in the introduction, and ob- 
tained a score of the total number of nonfluent 
sentences of more than one word. Because Ss had 
had such limited reading experience, pauses for 
word identification, as well as word substitutions 
or additions, were ignored when rating. As a check 
on the reliability of E’s rating, two specialists in
primary reading also listened to the reading of 20
Ss chosen randomly from all groups, for each of
the three test passages. Interrater reliability for 
nonfluent sentences over the 60 passages, based on 
the average intercorrelation among the three raters, 
was .45.
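
A reliability index "based on the average intercorrelation among the three raters" can be sketched as the mean of the pairwise Pearson correlations. The Pearson form is an assumption (the paper does not name the coefficient), and the ratings below are hypothetical values used only to exercise the function.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_intercorrelation(ratings):
    """Mean of the correlations over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical nonfluent-sentence counts for five passages, three raters.
rater_1 = [6, 4, 7, 5, 8]
rater_2 = [5, 4, 8, 5, 7]
rater_3 = [6, 3, 7, 6, 8]

reliability = average_intercorrelation([rater_1, rater_2, rater_3])
print(round(reliability, 2))
```

With three raters there are three pairs, so the index is the mean of three correlations.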

The word-recognition error score was the num- 
ber of words marked in each error category; the 
word-identification score was the number of new 
words read correctly in the passage. 

At the end of the third preprimer testing, the 
teachers resumed the normal classroom teaching 
of reading. Most children were ready to start the 
first primer of the reader series. 

At each level, 8 weeks after the end of the ex-
periment, Ss were posttested with the Primary
Reading Profiles, Level I, a group reading-achieve- 
ment test edited by the senior author of the text- 
books used in the experiment. Progress in compre- 
hension, word identification from context, and 
word recognition were measured; fluency and vo-
calization were not retested.


RESULTS 


A three-factor, Lindquist Type III (Lind-
quist, 1953) analysis of variance was used
for the main analyses. The between factors
for the analysis of each dependent variable
were treatment groups and learning-apti-
tude levels. The three preprimer tests con-
stituted the within factor. In a second anal-
ysis of each dependent variable the two be-
tween factors were treatment groups and
sex differences, the within factor, as before,
being the three tests.
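
The degrees of freedom implied by this design (two between-subjects factors, one within-subjects factor) can be worked out as in the sketch below. The formulas are the standard ones for a mixed design with equal cell sizes; the function name and dictionary labels are illustrative, and n = 16 per cell is taken from the note to Table 2.

```python
def mixed_design_df(groups, levels, tests, n_per_cell):
    """Degrees of freedom for a groups x levels (between) x tests (within) design."""
    subjects = groups * levels * n_per_cell
    error_between = groups * levels * (n_per_cell - 1)
    return {
        "Between-S": subjects - 1,
        "Group (B)": groups - 1,
        "Level (C)": levels - 1,
        "B x C": (groups - 1) * (levels - 1),
        "Error (between)": error_between,
        "Within-S": subjects * (tests - 1),
        "Tests (A)": tests - 1,
        "A x B": (tests - 1) * (groups - 1),
        "A x C": (tests - 1) * (levels - 1),
        "A x B x C": (tests - 1) * (groups - 1) * (levels - 1),
        "Error (within)": error_between * (tests - 1),
        "Total": subjects * tests - 1,
    }


# Three treatment groups, two learning levels, three preprimer tests.
df = mixed_design_df(groups=3, levels=2, tests=3, n_per_cell=16)
for source, value in df.items():
    print(f"{source:16s} {value}")
```

These values agree with the degrees of freedom legible in Table 3 (error terms of 90 and 180, Within-S of 192, total of 287).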

The posttest data for word recognition,
word identification, and comprehension
were analyzed separately by two-factor
analyses of treatment group by learning-
aptitude level, and treatment group by sex
difference.

Some Ss were lost during the experiment
and others were randomly rejected to give
16 in each treatment group at each level
for preprimer and posttest data. For the
analysis by sex difference, three more Ss
were rejected so that each treatment group
contained 17 girls and 14 boys.


Comparison of Treatments by Learning 
Levels 


Means and standard deviations for all 
the preprimer test variables, averaged over 
the three preprimer tests, are given in 
Table 2. Table 3 shows the levels of sig- 
nificance for the Type III analyses of 
variance for each of these variables.

Tests of homogeneity of variance showed 
a marked heterogeneity of variance for 
vocalization only (between treatment 
groups). Accordingly, following the recom- 
mendation of Norton’s study (Lindquist, 
1953, p. 86), the required level of signifi-
cance for this one test was raised to .025
(as opposed to .05 for all other tests) and a
t test for unequal variances was used.
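
A t test for unequal variances can be sketched from summary statistics as below. The Welch form with Welch-Satterthwaite degrees of freedom is an assumption, since the paper does not say which unequal-variance procedure was used. The inputs are the upper-level vocalization means and SDs for the silent and oral groups from Table 2, which are averaged over the three preprimer tests, so the result illustrates the method and is not the t value the author obtained.

```python
from math import sqrt


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Welch t statistic and approximate degrees of freedom from summary data."""
    var_term_1 = sd_1 ** 2 / n_1
    var_term_2 = sd_2 ** 2 / n_2
    t = (mean_1 - mean_2) / sqrt(var_term_1 + var_term_2)
    df = (var_term_1 + var_term_2) ** 2 / (
        var_term_1 ** 2 / (n_1 - 1) + var_term_2 ** 2 / (n_2 - 1)
    )
    return t, df


# Table 2, vocalization, upper level: silent (15.52, 18.25), oral (2.17, 4.08).
t_value, df_value = welch_t(15.52, 18.25, 16, 2.17, 4.08, 16)
print(round(t_value, 2), round(df_value, 1))
```

Because the silent group's variance is so much larger than the oral group's, the approximate degrees of freedom fall well below the 30 that a pooled-variance test would use.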

The significant main effect for perform- 
ance over the three preprimer tests was 
shown by t tests to be in the expected di-
rection for word-recognition errors and
nonfluent sentences. The significant effect 
for all comprehension was due to a decline 
in performance over tests. For word
identification, the effect was caused by 
higher performance on the second pre- 
primer test in comparison with the other 
two. 

The learning-levels effect for word iden- 
tification, literal comprehension, and flu- 
ency was in the expected direction. 

The significant treatments effect for 
vocalization was shown by t tests to be
due to superior performance of the oral 
group and echoic group compared to the 
silent group. The significant Tests × Treat-
ments interaction for nonfluent sentences
was caused by a significantly better per- 
formance of the echoic group in relation to 
the silent group on the third preprimer test. 

Thus the first hypothesis of superiority 
of the oral response practice over no overt 
practice was accepted only for vocaliza- 
tion. The hypothesis of superiority of 
echoic practice over both oral response 
practice and no overt practice wae not ac- 
cepted for any dependent variable, but for 
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TABLE 2 . ‘ 
~ Mpans-anp STanpaRD DEvIATIONS FoR THREE TREATMENT Groups at Two LEARNING 


LEVELS FOR ALL PREPRIMER TESTS a 
Treatment ae 
Test es Silent Oral Echoic 
Mu SD Mu SD Mu SD 
Word recognition( errors) | Upper 17.75 14.28 14.85 15.79 13.48 10.85 
Lower 13.96 17.79 16.00 11.52 17.06 12.55 
Word identification Upper 1.37 1.19 1.70 1,27 1.59 1,02 
Lower 1.00 1.06 0.77 0.66 1,44 1,20 
Literal comprehension Upper 2.40 0.44 2.42 0.64 2.58 0.87 ° 
Lower 2.10 0.70 1.90 0.42 2.33 0,42 
Inference comprehension | Upper 1.85 | 0.82 | 1.69 | 0.66 | 1.67 | 0.63 
; Lower 1.52 0.73 1.63 0.69 1.79 0.63 
Total comprehension Upper 4,25 1.06 4.10 1.04 4.25 1.31 
i Lower 3.63 1.21 3.52 0.82 4.13 0.98 
Nonfluent sentences Upper 6.00 1.15 5.08 1.92 5.44 1.58 
Lower 6.46 1.51 6.42 1.49 5.83 1.51 
Vocalization Upper 15.52 | 18.25 2.17 4.08 7.90 | 11.25 
Lower 12.46 15.35 3.83 6.83 5.08 9.60 


Note.—N = 16 at each learning level. 


both vocalization and fluency the echoic 
group was superior to the silent group. 
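
As a check on the Table 2 entries, the total comprehension mean in each cell should equal the literal plus the inference mean, up to rounding. The short sketch below runs that check; the dictionary layout and the tolerance of 0.02 are illustrative choices, not part of the original analysis.

```python
# (literal, inference, total) comprehension means from Table 2.
TABLE_2_COMPREHENSION = {
    ("upper", "silent"): (2.40, 1.85, 4.25),
    ("upper", "oral"): (2.42, 1.69, 4.10),
    ("upper", "echoic"): (2.58, 1.67, 4.25),
    ("lower", "silent"): (2.10, 1.52, 3.63),
    ("lower", "oral"): (1.90, 1.63, 3.52),
    ("lower", "echoic"): (2.33, 1.79, 4.13),
}


def discrepancy(literal, inference, total):
    """Absolute gap between the reported total and the sum of its parts."""
    return abs(literal + inference - total)


worst = max(discrepancy(*row) for row in TABLE_2_COMPREHENSION.values())
consistent = worst < 0.02  # allow for two-decimal rounding of each mean
print(consistent, round(worst, 2))
```

Every cell agrees to within rounding, which supports reading the three comprehension rows as literal, inference, and their sum.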


Comparison of Treatments by Sex

There was no main sex effect for any of
the preprimer tests but interaction be-
tween treatments and sex differences was
found for all comprehension tests and for
word recognition. It was found from t tests
that, for word recognition, the boys of the
oral group were superior to all other groups
of either sex for the first preprimer test.


TABLE 3
TYPE III ANALYSIS OF VARIANCE FOR ALL DEPENDENT VARIABLES

Dependent variables (MS and F for each): Word recognition | Word identification |
Literal comprehension | Inference comprehension | Total comprehension |
Fluency | Vocalization
BAPeS ra Seabee sd persia <4 ESV Dorel cs peers (5 
602,39 380.99 1.28 ' 7.97 498.48 
8.78) 0.01 | 302.68 | 0.82 | 2:20 | 1.04 | 0:18 | 0.08 | 3:90] 0.05 | gcs2| 1.25, [s027.d Hee 
7-06 | 0.01 |1716.00 | 4.66* | 9.03 | 7.64** | 0:59 | 0:38 | 14:29] 3.84 | 38.28] 5.02° 141.6810.38 
338.81 | 0.54 382-60 | 1.04 9.51) 0.48 | 1-27 | 0.82 | 1-85) 0.50 LS eS 1a 
67.42 216.74 ip ios 387 568 370.59 
50g (Pecos |1878-58 | 9.21°%*182-13 le4.saree| 4°85 | 4.02*/129:64151.77°¢*|186.908]111.08***) 128.76)0.29 
roe | Orig | 49-22 | 9-24 | 0.85 | 0.87 | 0:10 | 0-09 | 0:73) 0.81 | 5.46) 3.26° | 108.050. 
180 | O28 | ies | oon | 1.64 B03 | 3-20) tat | 5-43] 2.29 | Tor] 11a | $6 Oty 
ase] [acon | foc | | o05 U8] Bae OF | 5s] 97° | airco9 


Source of variance    df
Between-S             95
  Group (B)            2
  Level (C)            1
  B × C                2
  Error               90
Within-S             192
  Tests (A)            2
  A × B                4
  A × C                2
  A × B × C            4
  Error              180
Total                287
With all the comprehension tests, interac-
tion was symmetric. Girls of the echoic
group scored high, while boys of this group
scored low; for the silent group, boys
scored high and girls low; in the oral group,
performance was similar for boys and
girls. This effect, except for inference com-
prehension, was due largely to the results
of the third preprimer test.


Reading Posttest 


There was a levels effect in the expected 
direction for word recognition and identifi- 
cation but not for comprehension. There 
was no treatments effect for any test, nor 
interaction between levels and treatments. 
Nor were any significant sex differences
found.


Discussion 


Practice in giving an echoic or oral re-
sponse before reading silently caused no
improvement in word recognition, word 
identification, or comprehension. This is 
not in agreement with the earlier study 
(Neville, 1966) where echoic Ss were su-
perior, or with the experiment of McNeil
and Keislar (1963) using individual read- 
ing practice booths, where an oral response 
group was superior in silent reading to a 
covert response group. But neither did the
rather passive listening period prior to the
echoic response seem to have the adverse
effect on reading postulated by Buswell 
(1922) or Duggins (1958). 

One possible reason for the lack of sig-
nificant differences may have been the
nature of the silent reading. In the train-
ing periods, true silent reading by the
silent group was very difficult to achieve as
the children whispered whenever they
thought they were unobserved. Hildreth
(1955) believes that the beginner virtually
cannot read silently, so strong is the con-
nection between printed words and speech.
In the present study, the social interaction
(Mace, 1966) of the group reading situa-
tion may also have encouraged imitation of
vocalization.

The actual vocalization scores of the
echoic group appear, on inspection of Table
2, to lie between those of the oral and
silent groups but, because of the unequal
variances of the groups, the only significant
differences were between the oral and 
silent, and echoic and silent groups. Treat- 
ments effect showed at the first test and 
thereafter changed little. During the train- 
ing periods, too, the lack of vocalization 
in the echoic and oral groups contrasted 
sharply with the vocalizing of the silent 
group. It is possible that the extra, legiti- 
mate practice in associating the sounds of 
words with their symbols hastened the 
process of cue reduction and reduced 
vocalization in the oral and echoic groups. 

Although giving an oral response im- 
proved fluency of oral reading at the start 
of the experiment, this effect did not con- 
tinue.* During the third preprimer, with 
some of the better oral-group readers, a 
monotone chanting developed, perhaps as a 
reaction against the boredom of so much 
oral reading. The echoic group showed con- 
tinuing improvement which, by the third 
test, gave scores significantly superior to 
the silent group. 

Increased fluency did not produce the 
predicted improvement in other reading 
skills, particularly comprehension, but pos- 
sibly the effect would have shown with a 
still longer experiment. Alternatively, the 
difficulty of the comprehension questions 
and a tendency to guess their answers may 
have obscured treatment differences. The
decline in comprehension scores over the 
three tests appeared to be due to too rapid 
an increase in difficulty.

Contrary to McNeil and Keislar’s
(1963) suggestion, learning level did not
affect the success of the oral response
group, or of any other treatment group.
The lack of simple levels effect in the word
recognition results was to be expected since 
all the children supposedly knew the words 
of the preprimers before coming for train- 
ing. In the posttest comprehension and the 
inference comprehension, the levels effect 
was possibly obscured by the low scores of 
all pupils on these, the most difficult tests. 

In discussing sex differences which, in 
beginning reading, usually favor girls, Mc- 
Neil (1964) suggests that they may occur 
because boys receive more negative com- 
ments than girls during reading and fewer 
opportunities to read. Possibly the very 
rigid form of all reading lessons in the ex-
periment reduced these effects. The inter-
action with word recognition, showing that 
boys from the oral group performed better 
initially, should be viewed very cautiously 
and requires further study. The sex inter- 
action for comprehension is interesting in 
that just the opposite effect was found in 
the pilot study (Neville, 1965). The boys’ 
success in the earlier study was postulated 
to be due to their superior listening ability. 
Possibly the boys of the echoic group in 
the present study continued to rely on their 
good listening comprehension when answer- 
ing the comprehension questions, while the 
girls paid more attention to the reading. 
Silent reading practice, which necessitated 
attention to the text for full comprehen- 
sion, was perhaps better training for the 
boys for reading comprehension, especially 
of the more difficult third test passage. 
Some features of the experimental design 
may be of interest. The study attempted to 
carry out research in a situation approxi- 
mating the classroom, at the same time 
trying to control some of the variables 
thought to confound much classroom re- 
search. (a) To keep the teacher’s influence 
the same for all groups during training, 
E taught the whole sample. There were still 
teacher variables operating, especially in 
the classroom teaching of words (although 
no significant differences were found be- 
tween the word-recognition scores of 
classes of the same level) but these effects 
were equalized within each treatment 
group. (b) The Hawthorne effect was comparable
for all groups including the silent
or control group. (c) The reading methods 
were prescribed precisely and rigidly fol- 
lowed during training, in the hope that 
treatment effects could be fairly attributed
to the one differing variable, type of re- 
sponse practice. The children did not ob- 
ject to the unvaried form of the training 
and at the end of the experiment only six 
children, all good readers from all three 
groups, signified that they disliked reading. 
(d) The research continued training and 
testing over a reasonable length of time. 
The chief problem revealed by the study is
the difficulty of gaining a reliable assessment
of such aspects of beginning reading
as comprehension and fluency when Ss have 
so rudimentary a reading ability. 

The reduced vocalization and improved 
fluency of the overt response groups give 
some support to the linguists’ contention 
that the child needs to convert written 
words to familiar speech and suggest that 
the reading program should include oral 
and echoic responses as well as silent reading.
Moreover, the natural tendency for
beginners to vocalize should not cause concern
since, at this stage, overt responses appear
to foster reading achievement.


INTERACTION EFFECTS OF ITEM-DIFFICULTY SEQUENCE
AND ACHIEVEMENT-ANXIETY REACTION
ON ACADEMIC PERFORMANCE


DAVID C. MUNZ AND ALBERT D. SMOUSE
University of Oklahoma


In order to test the assumption that individual reaction to test taking 
mediates the effect of item-difficulty sequence on performance, college 
freshmen were randomly assigned a final examination with items se- 
quenced either hard to easy (H-E), easy to hard (E-H), or at random 
(R), and were then classified within each sequence as to achievement-anxiety
type. As predicted, a 3 (item-sequence) × 4 (reaction-type)
analysis of variance (N = 120) yielded significant F ratios (p < .01)
only for reaction type and interaction; however, several specific per- 
formances were significant in the nonpredicted direction. Results are 
explained using the inverted-U hypothesis and the assumption that
item sequences are progressively more arousing in the order of R,
E-H, H-E. Implications of the research are discussed.


Recent investigations of the standard 
test construction practice of arranging 
test items in an order of increasing diffi- 
culty, that is, the easier items first fol- 
lowed by the progressively more difficult 
ones, have found no empirical justification 
for such a practice (Brenner, 1964; Smouse 
& Munz, 1968). There appears to be little 
influence of an easy-to-hard item-diffi- 
culty sequence, as compared with a hard- 
to-easy or random sequence, on academic 
achievement scores when group measures 
are used. However, no attempt has been 
made to investigate the effect on per- 
formance scores of an interaction of per- 
sonality factors typically found in the 
test-taking situation and different item- 
difficulty arrangements of test items. 

There is an abundance of literature on 
personality factors which influence test- 
taking behavior in the classroom. One such 
factor is the differential reactions of in- 
dividuals to test-taking anxiety. Until 
recently many investigators viewed anx- 
iety as a unidimensional personality trait. 
Alpert and Haber, authors of the Achievement
Anxiety Test (AAT; 1960), have preferred
to view test-taking anxiety (achievement
anxiety) as a bidimensional construct
which may have facilitating as well as 
debilitating effects on academic performance.
For some individuals an anxiety-provoking
situation, such as a typical
college examination, facilitates their performance
while for others it depresses performance.
Further, there are those individuals
whose test performance is not affected
by anxiety-provoking situations,
either by improving or depressing their
scores. “Thus, an individual may possess a
large amount of both anxieties or of one
but not the other, or of none of either
[p. 213].”

The major argument supporting the ar- 
rangement of test items in an easy-to-hard 
(E-H) difficulty sequence has been that an 
E-H arrangement decreases test-taking 
anxiety, thereby facilitating performance. 
In comparison, a random (R) arrangement 
does not affect test-taking anxiety, and 
therefore does not influence test perform- 
ance, while a hard-to-easy (H-E) arrange- 
ment increases test-taking anxiety thereby 
depressing performance. The purpose of 
this study was to investigate the notion 
that item-difficulty arrangement does sig- 
nificantly affect performance scores but 
only by interacting with test-taking personality
factors. More specifically, differential
reactions to test-taking anxiety, as
measured by the AAT, interacting with
three item-difficulty arrangements of test
items (E-H, H-E, and R) have an effect on
achievement-test scores. It was hypothesized
that because of the anxiety-generating
effects of the H-E arrangement of test
items those individuals whose performance
is improved under anxiety-provoking situations
score significantly higher on the H-E
sequence than those individuals whose
performance is depressed or not affected
by test anxiety. Further, because of the
anxiety-reducing effects of an E-H arrangement
those individuals whose performance
is depressed under anxiety-provoking
situations score significantly higher
on the E-H sequence than those individuals
whose performance is facilitated or not 
affected by test anxiety. Moreover, indi- 
viduals whose performance is not affected 
by test anxiety are also not affected by 
item-difficulty arrangement. The research 
reported here was designed to test the 
following specific hypotheses: 

Hypothesis 1. Item-difficulty arrangement
of test items, as a main effect, does
not significantly affect performance scores. 
This hypothesis is in accordance with re- 
cent findings. 

Hypothesis 2. Differential reactions to 
test-taking anxiety, as measured by the 
AAT, significantly affect performance 
scores.

Hypothesis 3. Item-difficulty sequence 
(E-H, H-E, and R) and achievement-anxiety
reaction types (facilitators, debilitators,
nonaffecteds, and high-affecteds)
interact to produce a significant effect on
performance scores.

Hypothesis 3a. Facilitators, those Ss
scoring relatively high on the facilitating
scale (AAT+) and relatively low on the
debilitating scale (AAT−), perform significantly
better on the H-E arrangement
than do the other three reaction types.

Hypothesis 3b. Debilitators, those Ss
scoring relatively high on AAT− and relatively
low on AAT+, perform significantly
better on the E-H arrangement than do the
remaining three reaction types.

Hypothesis 3c. The performance of nonaffecteds,
those Ss scoring low on both AAT+ and AAT−,
is not significantly affected by item-difficulty
arrangement. No further predictions were
made regarding high-affecteds, those Ss
scoring high on both AAT+ and AAT−.


METHOD

Subjects


The Ss were 120 male and female students enrolled
in four sections of an introductory psychology
course taught at the University of Oklahoma.
The Ss were chosen from 181 students who had
filled out the AAT, which had been presented as a
research project of the psychology department.
The Ss were informed that the information pro- 
cured from the questionnaire would be used for 
research purposes only. 


Instruments 


The AAT was designed to measure the re- 
ported effects of anxiety experienced in test- 
taking situations. This instrument distinguishes 
between different degrees of anxiety that is re- 
ported by the respondent as either facilitating or 
debilitating to test performance. Each type of 
anxiety is measured by a separate subtest of items 
(AAT+ scale and AAT− scale) together comprising
a 19-item questionnaire.

The second independent variable consisted of 
three sequences of items on a final examination for 
an introductory psychology course. These sequences,
constructed for use in another study
(Smouse & Munz, 1968), contained the same 100 
multiple-choice items (four alternatives) differ- 
ing only in item-difficulty order (H-E, E-H, and 
R). The 100 items were selected from a pool of 
197 items administered as a final examination to 
931 students in the previous semester's introduc- 
tory psychology course. Item analysis of the 197 
items (N = 931) yielded item-difficulty values de- 
fined as percentage of students passing a given 
item (Nunnally, 1959). Those 100 items selected 
from the pool and having a relatively even spread 
of item-difficulty values over a range of 17.7%
to 96.6% were arranged in the three item-difficulty
orders: E-H, H-E, and R.
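
The item-difficulty index and the three orderings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `item_difficulty` and `build_forms`, the seed, and the tiny response matrix are hypothetical and are not taken from the study, which used 100 items and 931 examinees.

```python
import random


def item_difficulty(responses):
    """Return the percentage of examinees passing each item.

    `responses` holds one row per examinee; each entry is 1 (correct) or 0.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(row[j] for row in responses) / n_examinees
        for j in range(n_items)
    ]


def build_forms(difficulty, seed=0):
    """Return item indices ordered easy-to-hard, hard-to-easy, and at random."""
    items = list(range(len(difficulty)))
    # A higher percentage passing means an easier item, so E-H sorts descending.
    easy_to_hard = sorted(items, key=lambda j: difficulty[j], reverse=True)
    hard_to_easy = list(reversed(easy_to_hard))
    random_order = items[:]
    random.Random(seed).shuffle(random_order)  # the seed is arbitrary
    return {"E-H": easy_to_hard, "H-E": hard_to_easy, "R": random_order}


# Hypothetical responses: 4 examinees by 3 items (not data from the study).
responses = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
difficulty = item_difficulty(responses)
forms = build_forms(difficulty)
```

All three forms contain the same items, so any difference in scores between forms can only reflect the order of presentation.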


Procedure 


The AAT was administered in class to four sec- 
tions of an introductory psychology course 1 week 
prior to the final examination. Three forms of the 
final examination, H-E, E-H, and R, were admin- 
istered to all sections of the introductory psychol- 
ogy course and were randomly distributed within 
each section. Each section took the final exam un- 
der its own instructor and was given as much 
time as needed to complete the examination. The 
students recorded their answers on answer sheets 
which were electronically scored. 

The achievement-anxiety types were constructed
by selecting Ss from each item-arrangement
group in the following manner. An AAT+
score and an AAT− score were obtained on each S, after
which the AAT− score was subtracted from the
[...] defined as debilitators. For all remaining
Ss the two scores were summed and
ranked. The top 10 Ss in the resulting distribution
were defined as high-affecteds while the bottom 10
were defined as nonaffecteds.

The final examination data (total number of
items answered correctly) for the 120 Ss were subjected
to a 3 (item-difficulty order) × 4 (achievement-anxiety
types) analysis of variance.


RESULTS


Table 1 presents the analysis of vari- 
ance results for the total sample along with 
the simple main effects analysis. Consistent 
with Hypothesis 1, the analysis revealed 
no statistically significant item-difficulty 
order effect upon performance scores. There 
was, however, a statistically significant 
effect of the achievement-anxiety reaction 
variable on performance scores (F = 4.32, 
p < .01). These results supported Hypothesis 2.
Probing with the Newman-Keuls
Test (NKT) revealed that only the dif- 
ference between the facilitators and the 
debilitators (p < .01) and the difference 
between the facilitators and nonaffecteds 
(p < .01) were significant, facilitators 
scoring higher in both instances. 

There was a statistically significant 
interaction among the three item-difficulty 
orders and the four achievement-anxiety 
types (F = 3.22, p < .01); hence, Hypoth- 
esis 3 was supported. A simple main effects 
analysis indicated that within the achieve- 
ment-anxiety factor the R arrangement and 
E-H arrangements were significant (F =
5.36, p < .01; F = 4.00, p < .01, respectively).
Examining these two item-difficulty orders
with the NKT revealed that (a) on the
R form, facilitators and high-affecteds


TABLE 1
ANALYSIS OF VARIANCE SUMMARY FOR PERFORMANCE SCORES AS A FUNCTION OF
ACHIEVEMENT-ANXIETY TYPE AND ITEM-DIFFICULTY SEQUENCE

Source of variation               df       MS       F
Item-difficulty sequences (A)      2     107.50    1.05
Achievement-anxiety types (B)      3     442.41    4.32*
  B for A1                         3     582.50    5.36*
  B for A2                         3     409.37    4.00*
  B for A3                         3     108.63    1.06
A × B                              6     329.04    3.22*
Error                            108     102.32
Total                            119

Note. Abbreviated: A1 = random sequence, A2 = easy-hard sequence,
A3 = hard-easy sequence.
*p < .01.


Fig. 1. Mean performance scores for achievement-anxiety
types (facilitators, debilitators, high-affecteds,
and nonaffecteds) on item-difficulty arrangements
consisting of random (R), easy to
hard (E-H), and hard to easy (H-E). Axes: mean
performance score by form (R, E-H, H-E).


scored significantly higher than the debilitators
(p < .05) and nonaffecteds
(p < .01), and (b) on the E-H form, facilitators
scored significantly higher than the
other three anxiety types (p < .05; see
Figure 1). The specific interaction findings
contradict the subhypotheses formulated
under Hypothesis 3.
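
As a reading aid for Table 1, each F ratio in a fixed-effects analysis of variance is the mean square for the effect divided by the error mean square. The short Python sketch below recomputes the two main-effect ratios and the interaction ratio from the mean squares reported in the table. The dictionary keys and the function name `f_ratio` are illustrative only.

```python
# Mean squares as reported in Table 1.
MS_ERROR = 102.32
MEAN_SQUARES = {
    "item-difficulty sequence (A)": 107.50,
    "achievement-anxiety type (B)": 442.41,
    "A x B interaction": 329.04,
}


def f_ratio(ms_effect, ms_error):
    """F ratio for one effect: effect mean square over error mean square."""
    return ms_effect / ms_error


f_ratios = {
    source: round(f_ratio(ms, MS_ERROR), 2)
    for source, ms in MEAN_SQUARES.items()
}
```

Rounded to two places, the three ratios agree with the values of 1.05, 4.32, and 3.22 given in the text and table.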


DISCUSSION


In accordance with recent investigations, 
the results of this study suggest that the 
standard test-construction practice of arranging
test items in an order of increasing
difficulty is not justified from the argument
that an E-H item-difficulty sequence produces
higher performance scores than an
R or H-E arrangement. There appears to 
exist a more complicated relationship be- 
tween item-difficulty arrangement of test 
items, specific personality correlates, and 
achievement-test performance. 

Several of the interaction effects were 
unexpected. First, there were no significant 
differences among reaction types on the
H-E form. Hypothesis 3a had predicted
facilitators’ performance to be superior
to those of the other reaction types. Second,
for the E-H form, the only significantly
superior performance was that of the facilitators,
not the debilitators as had been predicted
in Hypothesis 3b. Third, although


Fig. 2. Curve representing the inverted-U hypothesis, here showing
performance as a function of item-difficulty sequence. Plottings along the
theoretical curve show mean performances of the various anxiety-reaction
types for the random sequence. Axes: mean performance score by level of arousal.


no predictions were made with regard to 
the ranking of mean performances within 
the R sequence, it was nevertheless sur- 
prising to find the facilitators and high- 
affecteds clustering significantly above the 
debilitators and nonaffecteds. Although, 
strictly speaking, the interaction hypothesis
(Hypothesis 3) was supported inasmuch
as the mean performances of the
various reaction types fluctuated differentially
from sequence to sequence, the pattern
of fluctuations clearly calls for an
alternate set of hypotheses. One plausible
post hoc explanation can be made on the
basis of two assumptions. The first involves
the inverted-U function, and the second
has to do with the relative arousal potentials
of the three item-difficulty forms.

The inverted-U hypothesis states that
behavioral efficiency varies as a curvilinear
function of what has been variously referred
to as “arousal” (Malmo, 1959), “drive
level” (Easterbrook, 1959), and “activation
level” (Fiske & Maddi, 1961). This
function, shaped roughly like an inverted
U, implies that there is a degree of arousal
which is optimal for performing a given
task. If an individual or group of individuals
are functioning at a drive level
which is greater or less than optimum for
a particular task, then performance on
that task is impaired. Thus, if one assumes
that differential reactions to achievement-test
anxiety on the R form place
the reaction types on the performance

curve as shown in Figure 2, this assump- 
tion alone will explain why the facilitators 
performed significantly higher on the ex- 
amination than the debilitators and non- 
affecteds, and why the nonaffecteds per- 
formed at the same level as the debilitators. 
The significantly superior performance of 
the high-affecteds over the nonaffecteds and 
debilitators is consistent with the place- 
ment of the high-affecteds higher on the 
performance curve. The inverted-U hypoth- 
esis becomes even more plausible when 
called upon to explain the total interaction 
effects. 

The unexpected interaction findings can 
be explained if the second assumption is 
made, namely, that each of the item ar- 
rangements produces a different degree of 
arousal, increasing in the order of R, E-H, 
and H-E. If one combines this assump- 
tion with the assumption that the various 
achievement-anxiety reaction types lie ini- 
tially on the performance curve as shown 
in Figure 2, then as one moves in the di- 
rection of increasing stress, that is, from 
R to E-H to H-E, 11 of the 12 results 
plotted in Figure 1 are predictable (com- 
pare Figures 1 and 2). Since the nonaf- 
fecteds’ performance is lower than the 
facilitators’ and high-affecteds’ perform- 
ance due to their low level of arousal, their 
performance improves proceeding across 
test forms. Since the debilitators’ performance
is lower than the facilitators’ and high-affecteds’
performances due to the debilitators’
overly high level of arousal, the
debilitators’ performance does not improve 
across forms. The facilitators’ and high- 
affecteds’ performances follow the same 
rationale. 
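
The two assumptions can be made concrete with a toy model. The Python sketch below is only an illustration of the reasoning. The quadratic curve, the arousal added by each form, and the starting positions of the two reaction types are all hypothetical values chosen for the example, not estimates from Figure 1 or Figure 2.

```python
def performance(arousal, optimum=0.0, peak=70.0, curvature=2.0):
    """Inverted-U: performance falls off on either side of the optimum."""
    return peak - curvature * (arousal - optimum) ** 2


# Hypothetical arousal added by each form, increasing in the order R, E-H, H-E.
FORM_AROUSAL = {"R": 0.0, "E-H": 1.0, "H-E": 2.0}

# Hypothetical starting positions on the arousal axis (not values from Figure 2):
# nonaffecteds begin well below the optimum, debilitators begin above it.
BASELINE = {"nonaffecteds": -3.0, "debilitators": 1.0}


def predicted_scores(reaction_type):
    """Predicted performance on each form for one reaction type."""
    return {
        form: performance(BASELINE[reaction_type] + shift)
        for form, shift in FORM_AROUSAL.items()
    }


nonaffected = predicted_scores("nonaffecteds")
debilitator = predicted_scores("debilitators")
```

Under these assumed values the under-aroused group moves toward the optimum and improves from R to E-H to H-E, while the over-aroused group moves further past the optimum and does not improve, which is the qualitative pattern the explanation requires.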

Of more practical interest, the finding 
that the facilitators and high-affecteds per- 
formed significantly higher on the random 
arrangement than did the debilitators and 
nonaffecteds indicates that performance 
differences due to differences in achieve- 
ment-anxiety reaction are not simply a 
product of specific item arrangements, but 
probably exist in the test forms typically 
found in the classroom. Further, the E-H 
order favors primarily the nonaffecteds 
while actually depressing the high-af- 
fecteds’ performance. It is recognized, how- 
ever, that the arousal value of the E-H 
form may lie in the clustering of difficult 
items at the end of the exam so that the 
practice of placing a few easy items at the 
beginning of an otherwise randomly ordered 
test may not have the same dramatic 
effect. But the implications of R sequencing 
for classroom grading must be contended 
with. If this study is generally valid, then 
the H-E sequence provides least variance 
attributable to personality factors and 
should be used when one is attempting to 
assess only academic achievement. If the 
personality variables in question become 
a legitimate part of the assessment, then 
the sequence should be selected accord- 
ingly. 

In addition to serving as a reminder of 
the weaknesses inherent in the intuitive 
approach to test construction, results of 
this study suggest several subsequent avenues
of research. First, how does one manipulate
the debilitators’ performance upward
by use of psychometric strategy?
Are the effects found here constant across
various examination situations and across
various kinds of test formats such as com- 
pletion, matching, recall, etc.? Also, might 
one unobtrusively program an examination 
by the use of easy and/or difficult items? 
This might be done not only to reduce 
test-anxiety effects which lower achieve- 
ment validity, but also to maximize the 
teaching function of “objective” examinations.

It is recognized that although the test-taking
reaction types used as an independent
variable in this study are based on a
theoretical instrument, they have been
somewhat operationalized and hence need
theoretical support. Further research would
be needed, however, to establish such
theoretical support on an ad hoc basis,
and this is presently being pursued.


Data are reported relating freshmen predictors to independently computed
grade-point averages for each of the 8 semesters of undergraduate
college residence. A substantial amount of instability of intellectual
performance over this 4-year time span is revealed. Implications
for college admission research and for policies governing failure and 
probation are discussed. A preliminary theoretical basis for the data 
and for the recommendations is also developed. 


There are hundreds of correlations re- 
ported between high school grades or rank 
in class, college entrance examinations, 
and the criterion of college grades. Many
of these studies use first quarter or first
semester averages, a good many others use
first year averages, and a few use 4-year
cumulative averages. Studies using fresh- 
men predictors and senior grades are al- 
most nonexistent. Several years ago the 
author (Humphreys, 1960) published the 
intercorrelations of semester grade aver- 
ages, each semester’s average being inde- 
pendent of the rest, and compared their 
intercorrelations to those of successive
trials in learning a motor skill, of successive
tests of intelligence, and to the Guttman
simplex model. Sharply reduced predictions
of senior grades could be inferred
from this discussion. More recently Juola 
(1966) has published correlations between 
common predictors and independent semes- 
ter averages that confirm this prediction. 
The author now has data that make pos- 
sible the computation of predictor correla- 
tions and criterion intercorrelations on 
large numbers of cases. The results are 
impressive and the implications for selection
of students clear. The data also suggest
possible changes in academic dismissal
and probation regulations. 

*The author acknowledges the substantial help
given in obtaining these data by the University of Illinois
Office of Admissions and Records, John Holland
of the American College Testing program, and
his research assistant, Diane McGrath. The latter
was supported by the University Research Board,
the University of Illinois.


METHOD

Semester grade-point averages were computed
independently for each freshman who entered the
University of Illinois in 1962 and 1963 for each
semester he was in residence. Eight independent
averages were available for the 1962 class and six 
for the 1963 class. Data on high school rank in 
class and on the separate tests and composite of 
the American College Testing (ACT) program 
were added. Intercorrelations were computed separately
for the two sexes and for the college in
which the student was enrolled. Finally an aggregate
correlational matrix was obtained in which
each separate correlation was weighted by the N
on which it was based. Thus any between-groups
correlations arising from sex or college differences
were controlled.
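
The aggregation step can be illustrated with a short Python sketch. It assumes a plain N-weighted mean of the within-group correlations, which is the most direct reading of the description above. The function name and the two subgroup values are hypothetical and are not data from the study.

```python
def weighted_mean_correlation(groups):
    """N-weighted mean of within-group correlations.

    `groups` is a list of (r, n) pairs, one per sex-by-college subgroup.
    """
    total_n = sum(n for _, n in groups)
    return sum(r * n for r, n in groups) / total_n


# Hypothetical subgroups (not data from the study): (correlation, N).
groups = [(0.40, 300), (0.30, 100)]
aggregate_r = weighted_mean_correlation(groups)
```

Because each correlation is computed inside one subgroup before averaging, mean differences between subgroups never enter the aggregate value.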
An important consequence of the above pro- 
cedure was that the Ns vary markedly from Se- 
mester 1 to Semester 8. The very large drop from 
Semester 6 to 7 was the result of starting the study 
before the end of the senior year for the second of 
the two classes. A good deal of the change in Ns
overall, however, is due to academic selection. Academic
dropouts decrease the range of talent and
attenuate correlation coefficients. Correction for 
this attenuation posed a problem. Standard devia- 
tions could be computed both for the predictors 
and for the criteria. The correction formula using 
ratios of predictor standard deviations assumes 
that the restriction is due only to a cut on the 
criterion scores and that there is homoscedasticity 
in the arrays, defined by the criterion, of the pre- 
dictor scores. The results from the use of this cor- 
rection formula are highly sensitive to variations 
from both assumptions. The formula which uses the 
ratios of the criterion standard deviations assumes 
that the cut is on the criterion only, but it is not 
as sensitive to small variations in the size of the 
standard deviations. It does assume, however, that 
the units of measurement are the same for each 
of the criterion measures. At first, grade-point
averages seem to be based upon the same
units. Closer analysis leads to doubt
which cannot be resolved by any empirical test.
Do University of Illinois faculty members use the
same scale of measurement in assigning grades to
freshmen and to seniors? Does the decrease in the
standard deviation of these averages reflect the
quality of the academic performance, stereotypes
concerning the performance of freshmen and seniors,
or some combination of the two?

Because of the doubts concerning corrections for 
restriction of range of talent it was decided to form 
a new aggregate correlational matrix in which N
was for all practical purposes constant throughout 
the table. It is possible to do this by selecting 
graduating seniors rather than entering freshmen. 
With restriction of range of talent made impossi- 
ble experimentally, differences in correlations from 
Semester 1 to Semester 8 cannot be explained by 
the dropping out of low ability students. 

There is, finally, the problem of possible differ- 
ential reliability of the grade averages from Se- 
mester 1 to Semester 8. Split-half estimates would 
have been possible though laborious to compute. 
It was decided that the adjacent semester correla- 
tions were satisfactory lower bound estimates. 


RESULTS


The aggregate table of intercorrelations 
of predictors and criteria based upon maxi- 
mum Ns available appears above the diag- 
onal in Table 1. The N for each correlation 
appears below the diagonal. Standard de- 
viations appear in the diagonal of the 
table. Correlations between predictors and 
freshmen criteria have the expected magni- 
tude. Any person who has engaged in 
college admissions research might well have 
estimated these values within a couple 
of hundredths before the study was started. 
He would probably have done less well, 
however, in estimating the correlations
with senior grades. The latter correlations
are certainly nonzero, but the utility of
the freshmen predictors for increasing the
academic performance of seniors is open
to serious question. 

The near constancy of the adjacent 
semester correlations rules out substantial 
differences in reliability of freshman and 
senior grades as an explanation of 
markedly reduced predictor correlations, 
but the problem of restriction of range of 
talent remains. This question is partially 
answered by the data in Table 2 in which 
N is approximately constant. Standard deviations
are again added to the correlational
table in the diagonal.

Adjacent semester correlations, the lower 
bound reliability estimates, are more 
nearly uniform than before, and the pat- 
tern of correlations is very similar to the 
pattern in Table 1. The intercorrelations of 
the predictors and the correlations of pre- 
dictors with early criteria are attenuated 
as would be expected from the reduced 
range of talent in the graduating seniors 
as compared to entering freshmen. 

The correlations in Table 2 are helpful 
in making interpretations of the data, but 
it is still desirable to have an estimate of 
what the correlations between predictors 
and senior grades would have been if 
there had been no restriction in range of 
talent. Correlations corrected for restric-


TABLE 1
INTERCORRELATIONS OF HIGH SCHOOL RANK IN CLASS, AMERICAN COLLEGE TESTING PROGRAM
VARIABLES, AND GRADE-POINT AVERAGES IN EIGHT SEMESTERS

Item 


High school rank 
English 
Mathematics 
Social science 
Natural science 
Composite 
Semester 1 
Semester 2 
Semester 3 
Semester 4 
Semester 5 
Semester 6 
Semester 7 
Semester 8 


Note. N is maximum for each correlation.


TABLE 2
INTERCORRELATIONS OF HIGH SCHOOL RANK IN CLASS, AMERICAN COLLEGE TESTING PROGRAM
VARIABLES, AND GRADE-POINT AVERAGES IN EIGHT SEMESTERS

Item            No.     1     2     3     4     5     6     7     8     9    10    11    12    13    14
High school rank  1 15.45  .356  .368  .244  .291  .393  .387  .341  .278  .270  .240   ...   ...   ...
English           2        3.32  .485  .545  .542  .773  .345  .278  .296  .236  .236   ...   ...   ...
Mathematics       3              4.95  .395  .471  .764  .279  .189  .171  .171  .145  .162  .156  .121
Social science    4                    4.33  .637  .802  .279  .244  .188  .198  .210  .225  .174  .149
Natural science   5                          4.21  .829  .306  .255  .177  .200  .184  .202  .159  .126
Composite         6                                3.35  .375  .298  .237  .255  .238  .252  .219  .173
Semester 1        7                                       .57  .556  .456  .439  .399  .415  .387  .342
Semester 2        8                                             .57  .490  .445  .418  .383  .364  .339
Semester 3        9                                                   .57  .562  .496  .456  .445  .354
Semester 4       10                                                         .54  .512  .469  .442  .416
Semester 5       11                                                               .59  .551  .500  .453
Semester 6       12                                                                     .58  .544  .489
Semester 7       13                                                                           .59  .541
Semester 8       14                                                                                 .59

Note. N is approximately 1,600 for each correlation. Standard deviations appear in the diagonal.


tion of range of talent utilizing the ratios 
of standard deviations of the criteria in 
each of the eight semesters are presented in 
Table 3. (Only two decimal places are 
retained for corrected values.) It is per- 
haps a little reassuring with respect to the 
corrected values to note the constant 
size of the standard deviations for the 
criterion variables in Table 2. Since there 
cannot have been any change in range of 
talent, the standard deviations should be 
identical if the scales of measurement used 
are identical.
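
The correction formula itself is not printed in the text. The Python sketch below assumes the standard correction for explicit selection on the variable whose standard deviations are known (often cited as Thorndike's Case 2), applied here with the criterion as the selection variable, as the Method section describes. The unrestricted standard deviation of .80 is a hypothetical value used only for illustration. The restricted correlation of .387 and the restricted standard deviation of .57 are the Table 2 entries for high school rank and Semester 1.

```python
import math


def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    """Estimate the unrestricted correlation from a restricted one.

    Assumes explicit selection on the variable whose two standard
    deviations are supplied, plus linearity and homoscedasticity.
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1.0 - r ** 2 + (r ** 2) * (u ** 2))


# r and sd_restricted are read from Table 2; sd_unrestricted is hypothetical.
corrected = correct_for_range_restriction(
    r=0.387, sd_restricted=0.57, sd_unrestricted=0.80
)
```

When the two standard deviations are equal the function returns the correlation unchanged, and a larger unrestricted standard deviation raises the estimate.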

An internal check on the accuracy of 
the corrections for restriction of range is 
also available. The obtained values in
Table 2 can be used to estimate the relationships
in Table 1. The comparison
between estimated and obtained correlations
tests the adequacy of the formula
for these data. A good fit leads to greater
confidence in the Table 3 results. Such a
fit is indicated by the mean algebraic discrepancy
between obtained values in
Table 1 and values estimated from Table
2 of −.003. The mean absolute discrepancy
is only .023.

The reduction in validity in Table 3
as a function of semester in college is
dramatic. Common variance between high
school rank in class and college grades
ranges from 26% in Semester 1 to about
8% in Semester 8. The composite score


from the academic ability test and each of 
its components show similar patterns al- 
though at a somewhat lower level of com- 
mon variance. The data suggest that people 
are changing and that “aptitude for college 
work” is far from stable. 
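
Common variance between two measures is the squared correlation expressed as a percentage. As a quick check, the Python sketch below squares the high school rank row of Table 3. The variable names are illustrative only.

```python
# Corrected correlations of high school rank with Semesters 1-8 (Table 3).
hs_rank_corrected = [0.513, 0.45, 0.37, 0.36, 0.31, 0.32, 0.30, 0.28]

# Common variance in percent: 100 times the squared correlation.
common_variance_pct = [round(100 * r ** 2, 1) for r in hs_rank_corrected]

first_semester = common_variance_pct[0]
last_semester = common_variance_pct[-1]
```

The first value reproduces the 26% cited for Semester 1, and the last shows how little shared variance remains by Semester 8.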

It would be desirable to have retest data 
on the ACT at the end of the senior year 
in order to pin down the explanation 
phrased in terms of change in people 
since there may be other logically per- 
missible explanations of the present data. 
Nevertheless, the phenomenon is so gen- 
eral (Humphreys, 1960, 1967) and in the 
present data so consistent among colleges 
and sexes, for example, from males in the 
homogeneous College of Engineering to 
either males or females in the heterogene- 
ous College of Liberal Arts and Sciences, 


TABLE 3
Correlations Corrected for Restriction of Range of Talent Between Predictors and Criteria

                              Semester
Item                1     2     3     4     5     6     7     8
High school rank  .513   .45   .37   .36   .31   .32   .30   .28
English           .401   .35   .27   .25   .22   .22   .24   .20
Mathematics       .396   .30   .25   .23   .20   .20   .18   .15
Social science    .365   .33   .27   .26   .24   .27   .19   .17
Natural science   .364   .30   .24   .23   .22   .23   .20   .16
Composite         .474   .40   .32   .31   .28   .29   .25   .21

that logical alternatives appear to be low
probability alternatives. 


Discussion


Is the change in intellectual perform- 
ance during the college years the result of 
continuing maturation which proceeds at 
a differential rate from one person to 
another or is the change due to the stimulation
and vicissitudes of the college environment?
An overall increase in intellectual
ability, which would suggest growth
or maturation, would be relevant, but 
there is no direct evidence concerning this. 
It might be argued that the kinds of per- 
formance measured by an intelligence test 
such as the Stanford-Binet do not show 
much increase after 16. On the other hand, 
there is every reason to believe that stu- 
dents are learning something intellectual 
while in college and that certain kinds of 
intellectual performance are increasing in 
level. Since it is difficult to support any 
fundamental difference between aptitude 
and achievement (Humphreys, 1962) it can 
be assumed that there have been changes 
in the intellectual level of the group as a 
whole in addition to the changes in the 
rank ordering of individuals that were observed.
There may well be both biological
growth factors and environmental stimu- 
lation and deprivation factors involved, 
but from certain points of view an ex- 
planation is immaterial. The empirical fact 
remains that there is a good deal of in- 
stability in intellectual performance dur- 
ing the 4-year undergraduate period and 
as a result the correlations of predictors 
with criteria show a great deal of shrinkage 
over this period of time. Senior performance 
is not predicted well enough from freshman
information for one to be at all content 
with present college admission practices. 

The author has no better selection in- 
struments to recommend than the present 
ones. These data do suggest, however, 
some altered approaches to admissions re- 
search. Motivational and interest variables 
have not shown much promise as pre- 
dictors of freshmen grades. It is probable, 
however, that so much of the reliable
variance in freshmen grades is associated
with high school senior academic performance,
which is measured both by rank
in class and by the ACT, that there is
little left to be associated with nonintellectual
variables. A difference score or a
residual score computed from freshmen
and senior grades would have sufficient
reliability to be predictable and would be
an appropriate criterion against which to 
validate nonintellectual variables. 
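
The two candidate criteria can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the residual is taken from an ordinary least-squares line predicting senior from freshman grades, the grade-point values are hypothetical, and the function names are invented for the example.

```python
def difference_scores(freshman, senior):
    """Senior grade minus freshman grade for each student."""
    return [y - x for x, y in zip(freshman, senior)]


def residual_scores(freshman, senior):
    """Senior grade minus the grade predicted from the freshman grade
    by a least-squares line fitted to the same students."""
    n = len(freshman)
    mean_x = sum(freshman) / n
    mean_y = sum(senior) / n
    sxx = sum((x - mean_x) ** 2 for x in freshman)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(freshman, senior))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(freshman, senior)]


# Hypothetical grade-point averages for five students.
freshman_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
senior_gpa = [2.6, 2.4, 3.3, 3.2, 3.9]

differences = difference_scores(freshman_gpa, senior_gpa)
residuals = residual_scores(freshman_gpa, senior_gpa)
```

By construction the residual scores are uncorrelated with freshman grades, which is what would leave room for nonintellectual variables to predict them; simple difference scores carry no such guarantee.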

A very different approach to selection 
research may be sounder theoretically and 
practically than the use of grades or a 
criterion derived from grades. Perhaps ad- 
mission tests should be validated primarily 
against staying in college versus dropping 
out. While this criterion would currently 
in most colleges be heavily contaminated 
with freshman academic performance, this 
is not necessary. It is argued below that 
the data presented here suggest changes in 
failing and probation regulations. Such 
changes, if made, would have a substantial 
effect on the nature of the dropout criterion
and would allow colleges to retain
a substantial number of potentially bac- 
calaureate-level students. 

There is indeed ample basis for discontent
with most procedures concerned with
placing students on probation and dropping 
students from college for academic de- 
ficiencies. It is obvious that a good many 
students who are dropped at the end of 
the first semester would do acceptable work
later in college. Students are also placed
on unrealistic probation requirements (for
example, the lower the academic performance,
the higher the requirement the next
term) and are subsequently dropped, who
would be able to do acceptable work in
later years. There seem to be two implicit
assumptions used in establishing failing
and probation regulations. One concerns
ability, the other motivation. If a student
has the necessary ability, probation should
put him on his mettle, and he will come
through if he works. If he works and does
not come through, he did not have sufficient
innate ability. If he seems, by some
measure, to have the necessary
ability but does not come through, he did
not work.

In place of these all or nothing assumptions
about ability and motivation, a
radically different interpretation should be
attempted. The first supposition will be
that intelligence is not fixed, that there is
no measurable or inferrable (from measurements)
innate capacity, and that gain
in intellectual functioning continues indefinitely
with adequate stimulation in a
healthy organism. By the same token loss
occurs without adequate stimulation. The
second supposition will be that intellectual
functioning, either on psychological tests 
or in the classroom, depends on a very 
broad, cumulative, well-learned repertoire 
of skills, knowledge, modes of performance, 
etc. Furthermore, this repertoire increases 
with age and experience. Also for a given 
absolute amount of change, the relative 
change on which test-retest correlations 
depend is smaller when the base is large 
than when it is small; that is, the amount 
of change which is of interest here is a 
function of age and experience. It would 
take considerable time for a student to 
gain enough to change appreciably his 
rank order in his peer group; the amount 
of time required to make a change of a 
given magnitude would be a function of 
age. For very young groups time would be 
measured in months, for older groups in 
years. A dull student can change into a
bright student, but this always happens 
gradually and the rate of change is slower 
among college students than among grade-school
students.²

It would also be expected that some 
specialized forms of intellectual functioning
are less dependent on the total accumulation
of intellectual skills and knowledge
than others. The rank order of older, more 
mature students may change more rapidly 
in certain kinds of learning situations 
than in others. Although quite far afield,
it might be noted that change in rank
order of performance in a discrimination-reaction
time task takes place very
rapidly (Fleishman, 1955). Holding age
constant, changes in rank order on the
Stanford-Binet, which taps very general,
old, well established learning, would be
expected to proceed more slowly than
changes in rank order in learning a
foreign language. College specialization
would generally represent a learning situation
that did not depend on the total
accumulation as much as does performance
on the Binet. Thus test-retest correlations
on the Binet should show more
stability than similar correlations for the
ACT, but this would reflect the age and
generality of the learning represented, not
a difference between aptitude and achievement.

² There are also biological factors involved in
intellectual learning that are completely neglected
in this discussion. Thus the time required to
change the rank order of a student by some given
amount is also a function of the biological organism
involved. Since there are no measurement
operations that allow the assessment of these biological
factors independently in the human, and
since these factors are not within social control,
they can be disregarded in this discussion.

On the basis of the present data and of 
the preceding theorizing it would be de- 
sirable to establish probation regulations 
such that the student would be kept in 
school as long as he was making minimally 
adequate progress toward an acceptable 
graduation average. The goal, in a nut- 
shell, is to give him time to change his 
level of performance. One simply cannot 
predict well enough from freshman aca- 
demic deficiency to senior performance. 
The typical counterargument is that the
colleges should get rid of marginal stu- 
dents and make space available for those 
of better quality. Because of the instabil- 
ity of performance, however, definition of 
better quality is just as suspect as marginal 
quality. 

There is also an economic argument to 
suggest changes in probation and failing 
regulations. Once the student has invested 
a semester in college study, and once the 
college has invested in the student by ad- 
mitting him, housing him, and teaching 
him, if he can attain graduation standards 
on or near schedule, the social gain is 
maximized, and the expense minimized, 
by retaining him. This approach, incidentally,
should not be termed “coddling,”
and it need not deteriorate into coddling 
if graduation standards are maintained. 

The problem of admissions research and 


a student time to overcome initial academic 
deficiencies would indeed change the nature
of the dropout criterion. It would be- 
come a better measure of the “sticking to 
the task” trait and less highly related to 
initial academic performance. It would 
allow for more covariance between non- 
academic predictors and the criterion. It 
presents, however, a research task of 
major proportions: the finding of good non- 
academic predictors. 

PROGRAMMED INTRODUCTION TO PSYCHOLOGY VERSUS
TEXT-BOOK STYLE SUMMARY OF THE SAME LESSON 


MARIANNE RODERICK AND RICHARD C. ANDERSON¹
University of Illinois 


Either the 1st 4 sets of the Holland-Skinner (1961) program or a
summary of the material contained in the program was given to 85 
college undergraduates and 116 high school seniors. Overall, those 
who completed the program scored higher on the achievement test 
than those who studied the summary. The advantage of the program
was greatest (a) with high school rather than college students; (b) 
on the delayed rather than the immediate achievement test; (c) on 
short-answer rather than multiple-choice test items. Among college 
undergraduates, for whom the program was designed, the results 
failed to show better achievement for the program than for the sum- 
mary, but the program took 4 times as long to complete. 


Critics of programmed instruction often 
voice the complaint that programs pre- 
sent material in steps that are unneces- 
sarily small, that they involve too much 
repetition, and that such features are not
required to produce learning with sophisticated
students. Pressey and Kinzer
(1964) have completed a study that gives 
empirical support to doubts about the 
efficiency of small-step programs. They 
prepared a succinct, textbook-style summary
of the first two sets of The Analysis
of Behavior (Holland & Skinner, 1961).
The summary consisted of 648 words,
while the section of program upon which
it was based contained 1,710 words and
also entailed 84 written responses. Students
took eight times as long to complete
the program as they did to read the
summary, yet those who received the
summary scored higher on the posttest.
Students who completed nine “auto-elucidative”
questions in addition to reading the
summary obtained the highest posttest
scores of all. The Pressey and Kinzer
experiment suffered from methodological
shortcomings. Instead of random assignment
of Ss to treatments, whole classes
received one treatment or another. The
posttest consisted of an essay examination


¹ The authors are indebted to Thomas Anderson,
Gerald Faust, and Philip Zediker for assistance in
running Ss and to Elwood Leslie for help in machine
scoring tests and completing item analyses.
They are grateful to the McGraw-Hill Book Co.
for permission to reproduce sections of The Analysis
of Behavior.


which was described as “carefully graded” 
but about which no further information 
was provided. 

The results of the Pressey and Kinzer 
study would appear to be inconsistent with 
the findings of other research that has 
employed The Analysis of Behavior. One 
time-consuming feature of this program 
is the frequent requirement to make writ- 
ten responses. Yet Williams (1963) has 
found that overt responding produces 
better achievement than reading the pro- 
gram with filled blanks. Casual inspection 
of The Analysis of Behavior suggests that 
it is a redundant program. It contains 
many groups of frames in which equiva- 
lent responses are required in the presence 
of identical or nearly identical stimuli. 
Herein lies another possible contributor to 
the inefficiency which Pressey and Kinzer 
seem to have found. However, Coulson
and Silberman (1960) reduced a 104- 
frame section of The Analysis of Behavior 
to 56 frames by removing frames judged 
to be redundant. The Ss who received the
standard program scored higher on the 
posttest than those who received the short- 
ened version. 

It would be easy to discount the Pres- 
sey and Kinzer study because of its short- 
comings and the contradictory findings 
from other experiments. Still, it does 
seem possible that redundancy and overt 
responding are necessary to produce satis- 
factory achievement given, and only 
given, the constraints of a small-step program
and that these time-consuming features
are not necessary to attain satisfactory
achievement from a text. In other
words, the value of such features may 
depend upon the form of the material in 
which they are included. 

There is a plausible argument for con- 
siderable redundancy. The presumption is 
that many apparently similar encounters 
with the material are necessary in order to 
arrange discriminations among the terms 
and concepts being taught. A single state- 
ment of a principle may be sufficient if 
one’s goal is merely to have the student 
name the principle when it appears in the 
verbatim form employed during instruc- 
tion. If, on the other hand, one wants the 
student to be able to recognize various 
expressions of the principle, to discuss the 
principle fluently in his own words, to 
identify new instances of the principle, and 
to apply the principle to novel cases not 
treated during the course of instruction, 
then it may well be necessary to require 
the student to deal with a variety of forms 
of the principle and a variety of examples. 

There is also a plausible argument for 
requiring the student to make overt, con- 
structed responses. People learn what they 
are led to do. A person may spontaneously 
make appropriate covert responses when 
reading a text. But then again he may 
skim, skip difficult sections, or render the 
material in a way different from that intended
by the author. The argument is
that the requirement to make overt responses
helps to ensure that the student
will actually make the responses neces- 
sary for learning. 

It remains to be seen whether the 
theoretical advantages of redundancy and 
overt responding are obtained in practice. 
To a greater or lesser degree, depending 
upon the task, the student will already be 
capable of the responses and discrimina- 
tions entailed in a lesson. Because of what 
he has previously learned, the student 
sometimes may almost spontaneously generalize
to appropriate stimulus and response
classes. These entering behaviors
may be systematically undervalued by 
programmers who have been exhorted to 
use small steps, keep error rate low, and 
leave nothing to chance. Students may 
often be compelled to endure a lengthy 
program when several pages of clear 
English could evoke the desired performance.
Commonly employed techniques for
the development and validation of programs
do not protect against inefficiency.
Presumably during the course of tryout 
and revision the programmer will discover 
the instances in which he has underestimated
the difficulty of teaching a concept.
But what of the instances in which dif- 
ficulty has been overestimated? As Markle 
(1967) has indicated in her excellent 
analysis of the problem, frames upon 
which students make few errors are very
unlikely to be eliminated from a program.
Nor will sections of a program be
compressed when students do very well on 
criterion test items measuring what these 
sections teach. Programs inevitably grow 
longer rather than shorter when revised. 
There is no empirical technique in use to 
detect superfluous redundancy.

If programmers tend to underestimate
entering behavior in the intended target
population, an empirical demonstration of
the value of considerable redundancy and
overt responding would be difficult. However,
a lesson characterized by a gradual
progression of small steps, repetition and
review, and the requirement to make
overt, constructed responses might show to
advantage in a population less skilled
than the intended target population. One
purpose of the present experiment was to
compare a small-step program and a textbook-style
summary of the material taught
by the program with students who were
grossly deficient in the entering behaviors
manifested within the population for which
the program was designed. These students
were assumed to have available few of
the responses to be acquired and were
assumed unlikely to discriminate and
generalize spontaneously in an appropriate
manner among stimuli and responses.
Consequently, the program was expected to
work much better (though not necessarily
well in absolute terms) than the summary
for the students relatively deficient in
entering behavior but perhaps only slightly
better than the summary for students
from the target population. The program
used in this study, The Analysis of Be- 
havior, was designed for use with college
students. The program and summary
were compared with both college and high 
school students. 

The redundancy in many small-step 
programs may not be necessary to produce 
adequate performance on an immediate 
test. However, it is well established that 
repetition and spaced review facilitate re- 
tention. Another purpose of the study re- 
ported herein was to compare a program 
and a succinct summary on both an im- 
mediate and a delayed achievement test. 
The program was expected to show to 
greater advantage on the delayed test 
than on the immediate test. 

The final purpose of the present experi- 
ment was to compare a program and a 
summary on both short-answer test items 
and equivalent multiple-choice test items. 
The student may be adequately prepared 
to recognize a new technical term if he 
has simply read a passage within which 
the term was defined and illustrated. How- 


Method


Subjects and Experimental Design 


Eighty-five college sophomores, juniors, and seniors
enrolled in an introductory course in educational
psychology and a heterogeneous group of
116 high school seniors served as Ss. One college S
was dropped due to failure to complete the program,
while five high school and seven college Ss
were lost because they were absent for the delayed
achievement test.

A 2 × 2 × 2 × 2 design was employed with
a repeated measure defining one of the factors.
The first factor was training method. The Ss completed
either the program or the summary. The
second factor was S status. The Ss were either high
school seniors or college undergraduates. Retention
interval was the third factor. The final factor
was test mode. Both a short-answer and a multiple-choice
achievement test were given to all Ss.


Instructional Materials


About half of the Ss received the standard version
of the first four sets of The Analysis of Behavior.
Each frame occupied a 3% X 8¥% inch
page; the answer to that frame appeared at the 
top of the following page. The program was 
mimeographed on blue paper through which the 
following page could not be read. Each set was 
stapled along the left margin to form a separate 
booklet. 

The remaining half of the Ss studied a textbook- 
style summary of the first four sets of The Analy- 
sis of Behavior. The summary was initially written 
by “lifting” material from the program in the
order in which it appeared there. The material was 
later condensed and arranged in paragraph form to 
make it readable. Technical terms were underlined 
upon their introduction into the text, but not 
again. One example was used to illustrate each 
principle; no other redundancy was included. Sev- 
eral psychologists read the summary and made 
suggestions for its improvement. All agreed that 
material coverage was adequate. Prior to the ex- 
periment about 50 high school seniors, attending a 
different school from the one in which the experi- 
ment was conducted, and about 25 college under- 
graduates, enrolled in an introductory educational 
psychology course, completed the achievement test 
on an “open summary” basis. They were asked to 
read the summary and then take the test, search- 
ing through the summary to find or verify answers. 
From 31% to 100% of the students answered each 
question correctly. The lowest percentages were 
obtained from the high school seniors on several 
short-answer items. However, none of the latter 
items was answered correctly by any of the high 
school seniors in another group of 50 which was 
not exposed to the summary. These data indicate 
that all of the test items could be answered on the 
basis of material contained in the summary. 

The final version of the summary contained 
1,799 words.³ The program contained 3,398 words
and involved 142 written responses. 


Procedure 


The Ss were assigned to treatments by issuing 
them “tickets” from a deck stacked in a predetermined
random order. The ticket directed S
either to a room in which the summary was employed
or to a different room in which the program
was used. The program and summary were not employed
within a single room because the program
takes more time. There could have been a reactive
effect had those completing the program seen
many others finishing early.
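
The ticket procedure amounts to fixing a random order of room assignments before any S arrives. The sketch below is one way such a deck could be built; the equal split between rooms, the seed, and the function name are assumptions made for illustration and are not described in the procedure.

```python
import random


def make_ticket_deck(n_subjects, rooms=("program", "summary"), seed=0):
    """Build a deck of room tickets and shuffle it once, so that the
    order is settled before the first ticket is handed out.

    Assumption: rooms are represented as equally as possible.
    """
    deck = [rooms[i % len(rooms)] for i in range(n_subjects)]
    random.Random(seed).shuffle(deck)  # fixed seed gives a reproducible order
    return deck


deck = make_ticket_deck(10)
# Each arriving S is handed the next ticket from the front of the deck.
for arrival, room in enumerate(deck, start=1):
    print(f"S {arrival}: go to the {room} room")
```

Because the order is decided in advance, neither the S nor the assistant handing out tickets influences which treatment a given S receives.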

The Ss in the program group received directions
similar to the standard program directions published
in The Analysis of Behavior. Those who
received the summary were told to read at their


³ For copies of the summary and the achievement
test, order NAPS Document No. 00066 from
ASIS National Auxiliary Publications Service,
c/o CCM Information Sciences, Inc., 22 West 34th
Street, New York, New York 10001; remitting
$1.00 for microfiche or $3.00 for photocopies.
own rate and were expressly permitted to reread
all or part of the material if they so desired. 

All Ss were told that they would receive a test 
when they finished the program or summary. As- 
sistants completed a control sheet upon which 
were recorded the order in which Ss completed the 
treatment and the time required for each to do so. 
As each pair of Ss finished the program or sum- 
mary one of the two was randomly assigned to 
receive the immediate achievement test while the 
other received an irrelevant (verbal reasoning) test 
as a time filler and placebo. This procedure 
equated the immediate and delayed achievement 
test groups in terms of training time. 

High school Ss received the delayed achieve- 
ment test 7 days after the immediate test. The 
interval between the two tests ranged from 6 to 9 
days for the college Ss. The delayed test was not 
announced and the teachers and others cooperat- 
ing in the experiment were asked not to reveal 
that a test would be given again. The delayed test 
was given at a regularly scheduled meeting of an 
educational psychology course in the case of college
Ss. In the case of the high school Ss assistants
made what was presumably an unexpected visit to
the cooperating school to administer the delayed 
test. There is no evidence that Ss expected a sec- 
ond test. On the contrary, many Ss seemed gen- 
uinely surprised when the delayed test was admin- 
istered. 

The same measure, consisting of 19 short-answer 
items and 19 equivalent multiple-choice items,* 
was used as both an immediate and delayed test. 
For pairs of short-answer and multiple-choice 
items, the item stems were identical, or nearly so. 
In 20 of the test items, the wording of the items 
was essentially the same as the wording of state- 
ments included within the lesson materials and/or 
the items involved the same examples as were used 
to illustrate concepts or principles within the 
lesson materials. The remaining 18 items contained 
wording substantially different from any instruc- 
tional statement and/or the items entailed new 
examples not included within the lesson materials. 
On each occasion the short-answer section of the 
test was administered first and collected before the 
multiple-choice section was distributed. The short- 
answer section was scored on the basis of a cri- 
terion list of acceptable answers. The multiple- 
choice section was machine scored and corrected 
for guessing.
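
The procedure does not state which correction for guessing was applied. The conventional formula subtracts from the number right a fraction of the number wrong, R − W/(k − 1), where k is the number of options per item. The sketch below assumes that formula; the four-option format and the example counts are hypothetical.

```python
def corrected_score(num_right, num_wrong, num_options):
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items count as neither right nor wrong.
    """
    return num_right - num_wrong / (num_options - 1)


# Hypothetical S: 13 right and 6 wrong on 19 four-option items.
example = corrected_score(num_right=13, num_wrong=6, num_options=4)
print(example)
```

Under this formula an S who guesses blindly on every item has an expected corrected score of zero, which is what makes the multiple-choice percentages comparable to the short-answer percentages.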


Results


Table 1 contains the achievement test 
means for the various experimental con- 
ditions. Since there were disproportionate 
numbers of cases per cell, an unweighted 
means analysis of variance, summarized 


*There were six additional multiple-choice
items for which there were no matching short-answer
items. The results with these six items are
not reported herein, though these results did parallel
those obtained with the rest of the test.

TABLE 1
Mean Percent Correct on the Achievement Test

                 High school             College
Condition    Immediate   Delayed    Immediate   Delayed
Program
  N              30         28          19         19
  SA           48.1%      40.0%       83.9%      73.1%
  MC           45.3%      41.5%       83.9%      77.8%
Summary
  N              28         25          22         17
  SA           33.3%      29.3%       84.4%      57.6%
  MC           38.3%      36.8%       84.0%      71.5%

Note. Included are delayed test scores of
only Ss who were taking the achievement test for
the first time. Abbreviated: SA = short answer,
MC = multiple choice.


in Table 2, was performed. This analysis 
did not include the delayed achievement 
test scores of Ss who had also completed 
the immediate test, but only the delayed 
test scores of Ss who had received an ir- 
relevant test immediately after the treat- 
ment and who were therefore taking the 
achievement test for the first time. 

All four main effects were significant.
The unweighted mean percent correct on
the achievement test was 61.7 for those
who completed the program and 54.4 for
those who read the summary, 39.1 for
high school students and 77.1 for college


TABLE 2
Analysis of Achievement Test Variance

Source                        df       MS         F
Between Ss
  Training method (M)          1      87.18      5.06
  Subject status (S)           1    2349.82    152.48**
  Retention interval (R)       1     187.54      8.02
  M × S                        1       6.42       .42
  M × R                        1       6.67       .43
  S × R                        1      38.65      2.51
  M × S × R                    1      20.93      1.36
  Ss within groups           180      15.41
Within Ss
  Test mode (T)                1      44.38
  M × T                        1      26.26
  S × T                        1       2.36
  R × T                        1      34.26
  M × S × T                    1       1.38
  M × R × T                    1
  M × S × R × T                1
  Ss within groups × T       180       5.33

Note. The analysis involved the delayed test scores of only
Ss who were taking the achievement test for the first time. The
analysis was completed before mean scores were converted to
percentages.
*p < .05.
**p < .01.


students, 62.6 when the test was given 
immediately and 53.5 when it was delayed, 
and 56.2 on the short-answer items and 
59.9 on the multiple-choice items. 
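
The unweighted means quoted for the main effects can be recovered by averaging the Table 1 cell means without regard to cell size. The sketch below does this; the cell values are transcribed from Table 1, with the college, program, immediate short-answer cell read as 83.9, which is the reading that reproduces the reported 61.7 program mean. The dictionary layout and function name are illustrative choices.

```python
# Cell means (percent correct) as read from Table 1.
# Key order: (training method, S status, retention interval, test mode).
CELL_MEANS = {
    ("program", "high school", "immediate", "SA"): 48.1,
    ("program", "high school", "immediate", "MC"): 45.3,
    ("program", "high school", "delayed", "SA"): 40.0,
    ("program", "high school", "delayed", "MC"): 41.5,
    ("program", "college", "immediate", "SA"): 83.9,  # assumed reading
    ("program", "college", "immediate", "MC"): 83.9,
    ("program", "college", "delayed", "SA"): 73.1,
    ("program", "college", "delayed", "MC"): 77.8,
    ("summary", "high school", "immediate", "SA"): 33.3,
    ("summary", "high school", "immediate", "MC"): 38.3,
    ("summary", "high school", "delayed", "SA"): 29.3,
    ("summary", "high school", "delayed", "MC"): 36.8,
    ("summary", "college", "immediate", "SA"): 84.4,
    ("summary", "college", "immediate", "MC"): 84.0,
    ("summary", "college", "delayed", "SA"): 57.6,
    ("summary", "college", "delayed", "MC"): 71.5,
}


def unweighted_mean(level):
    """Average the cell means that carry `level`, ignoring cell sizes."""
    values = [m for key, m in CELL_MEANS.items() if level in key]
    return sum(values) / len(values)


for level in ("program", "summary", "high school", "college",
              "immediate", "delayed", "SA", "MC"):
    print(f"{level:12s} {unweighted_mean(level):5.1f}")
```

The results agree with the figures in the text to within about a tenth of a percentage point, the small differences being attributable to rounding of the tabled cell means.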

There were two significant interactions. 
The Training Method X Test Mode interac- 
tion is graphed in Figure 1. The figure 
shows that the advantage of the program 
over the summary was greater on short- 
answer items than it was on multiple- 
choice items. Figure 2 pictures the Reten- 
tion Interval x Test Mode interaction. 
There was a smaller decrement over the 
retention interval on multiple-choice items 
than on short-answer items. 

Completed also was a second analysis 
involving only delayed achievement test 
scores. The new variable of interest, which 
turned out to make a significant difference 
(F = 8.68, df = 1/166, p < .01), was 
whether S had received the immediate 
achievement test or an irrelevant test in 
its place. Those who received the immediate 
achievement test showed an unweighted 
mean percent correct of 63.5 on the delayed
achievement test while the percent
correct for those who received the irrelevant
immediate test was 53.5. Whether or
not S took the immediate achievement
test interacted significantly with test mode
(F = 9.75, df = 1/166, p < .01). Taking
the immediate achievement test (see Figure

Fig. 1. Percent correct on the achievement test
for groups that received the program or the summary
as a function of test mode.


Fig. 2. Percent correct on short-answer and multiple-choice items as a function of retention interval. (Vertical axis: percent correct on achievement test; horizontal axis: retention interval, immediate versus delayed test; separate lines for multiple-choice and short-answer items.)


3) had a greater effect on performance 
on short-answer items than on multiple- 
choice items contained in the delayed 
test. 

Table 3 contains mean training times. 
Notice that Ss spent about five times as 
long to complete the program as they did 
to read the summary. 
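
The time ratios can be checked directly from the mean training times in Table 3. The sketch below uses the values as printed there; the variable names are illustrative.

```python
# Mean training time in minutes, from Table 3.
training_minutes = {
    "high school": {"program": 74.31, "summary": 14.06},
    "college": {"program": 57.02, "summary": 14.72},
}

# Ratio of program time to summary time within each group.
ratios = {
    group: times["program"] / times["summary"]
    for group, times in training_minutes.items()
}

# Ratio of the summed program times to the summed summary times.
overall = (
    sum(t["program"] for t in training_minutes.values())
    / sum(t["summary"] for t in training_minutes.values())
)

for group, ratio in ratios.items():
    print(f"{group}: program took {ratio:.2f} times as long")
print(f"both groups combined: {overall:.2f} times as long")
```

High school Ss took a little over five times as long on the program and college Ss a little under four times as long, so the combined figure of roughly five and the figure of about four for undergraduates alone are both consistent with the table.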


Discussion 


Like Pressey and Kinzer (1964), the 
present authors found that college under- 
graduates who complete the initial sections 
of The Analysis of Behavior score no 
higher on an achievement test given im- 
mediately (83.9%) than do undergraduates 
who study a summary of the material 
contained in the program (84.2%). Fur- 
thermore, in the present experiment under- 
graduates spent about four times as long 
working on the program as they did reading
the summary, once again approximately
replicating Pressey and Kinzer.
However, unlike the Pressey and Kinzer 
study, which was limited to the perform- 
ance of college undergraduates on an 
immediate achievement test, the present
study showed a significant overall achievement
advantage for the program.

It was expected, for reasons outlined 
earlier, that the program would be most 

Fig. 3. Percent correct on the delayed achievement test for groups that did or did not receive the immediate achievement test as a function of test mode. (Vertical axis: percent correct on the delayed achievement test; horizontal axis: test mode, short answer versus multiple choice; separate lines for Ss who received and who did not receive the immediate achievement test.)


markedly superior to the summary: (a) 
with high school rather than college stu- 
dents; (b) on the delayed rather than the 
immediate achievement test; and (c) on 
short-answer rather than multiple-choice 
test items. Each of the expected trends 
appeared in the data; however, only the 
latter one, the Training Method x Test 
Mode interaction, was statistically signifi- 
cant in the overall analysis of variance. 
However, one-tailed t tests indicated that
the program led to significantly greater
achievement than the summary among
high school students (t = 2.14, df ≈ 90,
p < .05) but not among college students
(t = 1.23, df ≈ 90, p > .05); and
on the delayed test (t = 2.15, df ≈ 90,
p < .05) but not on the immediate test
(t = 1.22, df ≈ 90, p > .05).

The fact that receiving the immediate 
achievement test produced a significant 


TABLE 3
Mean Training Time in Minutes

Condition    High school    College
Program         74.31        57.02
Summary         14.06        14.72
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increment on the delayed test is not sur- 
prising, since it has been well documented 
that responding to questions or test items 
during or shortly after training facilitates 
later performance, even when, as in the 
present case, no knowledge of results is 
provided (Michael & Maccoby, 1961; 
Rothkopf, 1966; Spitzer, 1939). There is 
the question of whether it is more effective 
to intersperse questions within the instruc- 
tional materials, such as is done in a pro- 
gram, or more effective to ask a series of 
questions after a relatively lengthy pres- 
entation. The latter alternative proved 
more potent in the present study. Con-
sidering only performance on the delayed
achievement test, it made a small (and 
nonsignificant) difference whether S com- 
pleted the program (61.3%) or the sum- 
mary (55.7%) but it made a somewhat 
larger (and significant) difference whether 
he received the immediate achievement 
test (63.5%) or not (53.4%). Among high 
school Ss, the program and the immediate 
test produced increments of the same size 
and these increments were additive. How- 
ever, with respect to college undergradu- 
ates, for whom the program was designed, 
the optimum treatment was the summary 
followed by the immediate test (85.9%).
This is an instance in which the teach-
and-test policy often denigrated by advo-
cates of programmed instruction worked
best.
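
As a rough worked check on the comparisons above, the following Python sketch recomputes the percentage-point differences quoted in the text and the program-to-summary time ratios from Table 3. The variable and function names are illustrative assumptions; only the numbers come from the article, and the sketch adds no statistical test of its own.

```python
# Percent correct on the delayed achievement test, as quoted in the text.
DELAYED_PCT = {
    "program": 61.3,
    "summary": 55.7,
    "immediate_test": 63.5,
    "no_immediate_test": 53.4,
}

# Mean training time in minutes, from Table 3.
TRAINING_MINUTES = {
    "high_school": {"program": 74.31, "summary": 14.06},
    "college": {"program": 57.02, "summary": 14.72},
}


def point_difference(high: float, low: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(high - low, 1)


def time_ratio(group: str) -> float:
    """How many times longer the program took than the summary."""
    minutes = TRAINING_MINUTES[group]
    return round(minutes["program"] / minutes["summary"], 2)


method_gain = point_difference(DELAYED_PCT["program"], DELAYED_PCT["summary"])
testing_gain = point_difference(
    DELAYED_PCT["immediate_test"], DELAYED_PCT["no_immediate_test"]
)

print(f"Program vs. summary: {method_gain} points")
print(f"Immediate test vs. none: {testing_gain} points")
for group in TRAINING_MINUTES:
    print(f"{group}: program took {time_ratio(group)} times as long")
```

On these figures the immediate test is associated with the larger gain (about 10 points against about 6), while the program took roughly four to five times as long as the summary.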
Test mode interacted significantly with 
both training method and presence or ab-
sence of the immediate achievement test.
In each case the short-answer items were 
more sensitive to the treatment variable 
than the multiple-choice items. It is pos-
sible that short-answer items are more sen-
sitive than comparable multiple-choice
items to any treatment difference. How-
ever, the authors prefer a different inter-
pretation. Discounting the fact that it is
often possible to eliminate one or more
obviously wrong alternatives when con-
sidering a multiple-choice item, the as-
sumption is that the two kinds of items
require associative learning in about the
same measure. The big difference between
the item types is in the requisite level of
response learning. An S will not be able
to emit a poorly integrated response of 
low strength on a short-answer item, but 
he may be able to pick the response term 
from among a set of alternatives. The 
explanation for the greater sensitivity of 
the short-answer items in this experiment 
is that both the program, because of 
the overt response requirement, and the 
opportunity to practice the achievement 
test enhanced response learning. If this line 
of reasoning is correct, multiple-choice 
items might be as sensitive as short-an- 
swer items to treatments which do not 
differentially affect response learning.

The results of the experiment reported 
herein do provide some support for the 
rationale behind such program features as 
redundancy and overt responding. None-
theless, as a practical matter, the most
noteworthy finding was that for college
students the program did not produce
demonstrably better achievement than
the summary but took a lot more time.
The authors want expressly to disavow
any broad generalizations based on this
single instance. Programs that are super-
ficially similar may have very different
instructional consequences. Indeed, this
experiment might have come out differently
had later sections of The Analysis of Be-
havior been used. If our analysis is cor-
rect, whether a particular program will
outperform a summary will depend upon
the distance between actual entering be-
havior and desired terminal behavior, that
is, “difficulty;” whether entering behavior
is over- or underestimated; whether the
programmer has a bias toward “overkill”
in the amount of redundancy included;
whether empirical techniques are employed
in program construction, testing, and de-
velopment that guard against superfluous
redundancy as well as detect gaps in the
task analysis. Not mentioned previously,
but obviously important, are such ad-
ditional factors as the completeness of
the task analysis and the adequacy of the
design of individual frames.

A program with one or more defects
may fail to outperform a summary. Most
defects cannot be found in a simple ex-
amination of a program. Nor is the demon-
stration that students who complete a
program do well on a posttest a guarantee 
of freedom from defects; students who get 
some other form of instruction might do 
better in less time. The authors should like 
to propose that as a general quality con- 
trol procedure those who develop pro- 
grams accept responsibility for demon- 
strating that their programs outperform 
summaries. The textbook-style summary 
of the material in a program makes a 
feasible trial horse because it can be pre- 
pared inexpensively, almost by formula. 
Because it contains minimal redundancy, 
a summary would be especially useful in 
detecting superfluous redundancy, but it 
could also provide a yardstick to gauge 
other shortcomings. Finally, if it were the 
common practice to compare programs 
with summaries as a step in validation, 
generalizations about the limits of pro- 
gramming techniques, as they are known 
today, with various populations and sub-
ject matters might thereby arise.
THE APPROACH-AVOIDANCE PARADIGM AS A MODEL FOR
THE ANALYSIS OF SCHOOL ANXIETY¹


JAMES A. DUNN
Harvard University 


A Dollard and Miller approach-avoidance paradigm was used to pre-
dict age, sex, and social-class differences in children’s school anxiety
using as a basis the degree to which Ss reported liking and valuing
the academic aspects of school. Analysis of variance was applied to
data collected from 480 public school children. Significant results were
obtained in 8 out of 10 analyses. Major findings were: (a) as children
grow older negative affect for school increases; (b) lower-class chil-
dren report greater school anxiety than middle-class children; and (c)
lower-class children report more positive affect for the social and the
academic aspects of school than middle-class children. Other lesser
findings were also reported.


Academic achievement presumably is 
not only a function of the instructional 
stimuli which impinge on a learner, but
also of the affective state of the learner 
during the period of that impingement. 
During the past decade, research on sys- 
tems theory, programmed instruction, com- 
puter simulation, and the like (e.g., Lums- 
daine & Glaser, 1960; Ryans, 1963, 1964; 
Suppes, 1966) has materially advanced 
one area of knowledge concerning human 
learning. Somewhat less attention has been 
devoted to understanding the influence of 
motivational, that is, affective-attitudinal, 
factors. 

Most conceptualizations of motivation 
have either been unidimensional, varying 
only in intensity, as in simple drive theory; 
or bidimensional, varying in intensity and 
directionality. Dollard and Miller (1950) 
extended this latter conceptualization to 
permit the consideration of multidirec- 
tionality. 

In the tradition of Lewin (1936), the 
directionality of drive was a function of 
the valence characteristic of the goal. 
Generally, the theoretical problem of multi- 
directionality was resolved either by posit- 
ing an alternating valence condition of the 
object which was due to a changing, often
oscillating, need structure of the organism;
or by the juxtaposition of objects having
different valences, such as food and shock
grids.

¹This research was supported, in part, by
Grant No. 01428 and the TRUSSO Re-
search Commission on Pupil Personnel Services. It
was based on a paper presented at the American
Psychological Association annual convention, New
York City, New York, 1966.

Regarding human learning, at least 
two attitudes may operate to motivate 
learning behavior: (a) affect for the ma- 
terial to be learned; and (b) the perceived 
value of the material to be learned. Con- 
sequently, some understanding of those 
factors, especially as regards school-rele- 
vant attitude and affect states, would 
seem to be necessary if one is interested 
in maximizing children’s school perform-
ance.

Two sizable modern studies of children’s 
attitudes have been carried out; one by 
Jersild and Tasch (1949), the other by 
Witty (1960). Both contained sections 
dealing with children’s school attitudes. 

Jersild and Tasch’s study was based on
questionnaire data collected from over
3,000 children. One of their more striking
findings was a “decline, with age, in
children’s educational morale.” At the ele-
mentary school level Jersild found only
1% of children’s wishes about school to be
of a derogatory nature. This figure increased
to 10% with junior high school pupils.

Using data from 2,000 pupils in Grades
3-9, Witty found that children liked those
subjects best in which they received the
best grades. He also found that boys were
less interested in the academic aspects of
school than girls.


Recently Phillips (1966) has reported
an extensive, semilongitudinal study of
school anxiety, and its antecedents, based
on a sample of approximately 600 children.
One of Phillips’ major findings, in addition 
to significant sex findings, was a rela- 
tively high level of school anxiety among 
minority-group school children, that is, 
Mexican-American and Afro-American ele- 
mentary school children. 


THEORY


The basic assumption underlying this 
study is that there should be some relation- 
ship between the degree to which a child is 
anxious about school, and the degree to 
which he likes or dislikes and values or 
devalues school. 

A positive orientation on both of these 
dimensions, affect and value, would seem to 
be of crucial importance for a child’s ad- 
justment to, and his sustained performance 
in, school. The pupil who both likes to 
study academic subjects and who considers 
the study of academic subjects important 
would have a more positive attitude toward
learning and would, presumably, show less
anxiety in his performance than a peer
who valued academic achievement, but
was negatively attracted (i.e., repelled) by
the character of the work involved. The
applicability of the approach-avoidance
paradigm to the latter instance would seem
to be clearly patent. A child who considers
academics important but dislikes them
would be in an approach-avoidance situa-
tion and hence presumably under a good
deal of personal stress in precisely those
situations calling for a high level and
quality of academic performance.
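
The two-dimensional reasoning above can be sketched as a small classification rule. The Python below is illustrative only and rests on several assumptions that the article does not state: that higher ratings on the 6-point scales mean more liking or more valuing, that the scale midpoint (3.5) is a reasonable cutoff, and that the two cases not discussed in the text deserve the labels given here. The names `Rating`, `orientation`, and `MIDPOINT` are hypothetical.

```python
from dataclasses import dataclass

SCALE_MIN, SCALE_MAX = 1, 6
# Hypothetical cutoff: the article reports no threshold for "likes" or "values".
MIDPOINT = (SCALE_MIN + SCALE_MAX) / 2


@dataclass(frozen=True)
class Rating:
    affect: float  # how much the child likes academics (assumed: higher = more)
    value: float   # how important the child considers academics


def orientation(rating: Rating) -> str:
    """Label a pupil's stance toward academics from affect and value ratings."""
    for score in (rating.affect, rating.value):
        if not SCALE_MIN <= score <= SCALE_MAX:
            raise ValueError(f"rating {score} is outside the 6-point scale")
    likes = rating.affect > MIDPOINT
    values = rating.value > MIDPOINT
    if likes and values:
        return "approach"
    if values and not likes:
        # The case the text singles out: important but disliked.
        return "approach-avoidance"
    if likes and not values:
        return "liked but not valued"  # label assumed, not from the article
    return "avoidance"                 # label assumed, not from the article


print(orientation(Rating(affect=5, value=5)))
print(orientation(Rating(affect=2, value=5)))
```

Under the theory, the second pupil (high value, low affect) is the one expected to show the most school anxiety.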

The simplest defense for a child in such
a situation would be, of course, either
psychological withdrawal via daydreaming
and/or rationalization, or physical with-
drawal via truancy and/or school dropout.
When withdrawal, either physical or psycho-
logical, is impossible, one would then ex-
pect to find a high degree of anxiety,
hostility, and negative affect.
If school anxiety does, in fact, derive
from such complex attitude patterns toward
school, to the extent one would expect
to find age, sex, and social-class differences
in those attitudes, one could also expect
to find age, sex, and social-class differences 
in the resultant school anxiety. 

The present study, then, involves two 
sets of hypotheses, the first dealing with 
age, sex, and social-class differences in the 
basic school attitudes, that is, the affect-
value attitudes; the second, a superordinate
set, deriving from the former, dealing with 
predicted age, sex, and social-class dif- 
ferences in school anxiety. 


Hypotheses 


It was predicted that the adolescent, 
as contrasted to the preadolescent, would 
(a) place greater value on the academic 
aspects of school, but because of the in- 
creasing stress associated with school and 
presumably an increasing encounter with 
negative evaluations of his work, the 
adolescent would (b) have less positive af- 
fect for the academic aspects of school. 

Because girls in our society typically 
have less social freedom and less need for 
vocational skills, and because they typically 
encounter more academic success in their 
early school years, it was hypothesized 
that girls would (c) have more positive 
affect toward the academic aspect of school, 
but would (d) value those aspects less than 
their male counterparts. 

Because of the growing social awareness 
of the crucial role of education for up- 
ward mobility, it was predicted that 
lower-class children, as contrasted with 
middle-class children, would (e) place more 
value on the academic aspects of school 
than middle-class children; but because of 
their more frequent encounters with school 
failure, especially in the early years, they 
would (f) like the academic aspects of
school less.

On the basis of the Dollard and Miller
(1950) approach-avoidance paradigm, it
was hypothesized (g, h, i) that if males,
adolescents, and children of lower socio- 
economic status did, in fact, value the 
academic aspects of school but had rela- 
tively little positive affect toward those 
aspects, then those males, adolescents, and 
lower socioeconomic class children would 
manifest more anxiety about school than 
would preadolescents, girls, and upper-
middle-class children.

Although the basic intent of the present 
study was to investigate these hypotheses, 
because of the opportunity afforded, an ad
hoc investigation of children’s attitudes
toward the social aspects of school was 
also included. In general, it was tentatively 
expected that adolescents would no longer 
consider the social aspects of school either 
as important or as positive as would pre- 
adolescents; that, because of their greater 
dependence on school for social contacts, 
adolescent girls would both value and like 
the social aspect of school more than boys; 
and that lower-class children would neither 
like nor value the relatively rigid, middle- 
class supervised, social aspects of school 
as much as middle-class children. 
Because of the desirability of collecting the data
in situ so that familiar school and classroom cues 
would be present for the child as he responded to 
the instrument, the classroom was chosen as the 
unit of data collection. The child, however, was the 
unit of data analysis. 


Subjects 


TABLE 1
Characteristics of Total Sample

                          Pupil distribution
Socioeconomic class     Male    Female    Total

Grade 5
  Middle                  78       58       136
  Lower                   65       67       132
  Total                  143      125       268

Grade 7
  Middle                  53       59       112
  Lower                   58       48       106
  Total                  111      107       218

Grade 9
  Middle                  56       49       105
  Lower                   52       65       117
  Total                  108      114       222

All grades               362      346       708

Note.—A total of 30 classrooms were sampled,
5 for each socioeconomic class in each grade.

TABLE 2
Age of Subjects Selected for Analysis: Means
and Standard Deviations by Grade,
Socioeconomic Class, and Sex

Socioeconomic     Fifth grade        Seventh grade      Ninth grade
class             Male    Female     Male    Female     Male

Middle
  M               10.23   10.13      12.00   12.15      14.18
  SD              va      .46        123     .48        7
Lower
  M               11.08   10.40      12.88   12.75      14.78
  SD              ‘e3     cos        .88     .74        .80

Data were collected from 30 classrooms, five
classes from each social class at each of three
grade levels (see Table 1). Forty Ss were then
drawn, using a table of random numbers, for each
of the 12 cells of the analysis. Thus, the subsample
on which data analysis was executed was 480 stu-
dents.
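
The sampling step just described, 40 Ss drawn at random from each of the 12 Grade × Social Class × Sex cells, can be sketched as follows. This is a minimal illustration under stated assumptions: Python's `random` module stands in for the printed table of random numbers the study used, the pupil identifiers are invented placeholders, and the per-cell counts are those given in Table 1.

```python
import itertools
import random

GRADES = (5, 7, 9)
SOCIAL_CLASSES = ("middle", "lower")
SEXES = ("male", "female")
CELL_SIZE = 40

# Pupils available in each cell, taken from Table 1.
AVAILABLE = {
    (5, "middle", "male"): 78, (5, "middle", "female"): 58,
    (5, "lower", "male"): 65, (5, "lower", "female"): 67,
    (7, "middle", "male"): 53, (7, "middle", "female"): 59,
    (7, "lower", "male"): 58, (7, "lower", "female"): 48,
    (9, "middle", "male"): 56, (9, "middle", "female"): 49,
    (9, "lower", "male"): 52, (9, "lower", "female"): 65,
}


def build_roster(available):
    """Create placeholder pupil IDs for every cell (hypothetical identifiers)."""
    return {
        cell: [f"{cell[0]}-{cell[1]}-{cell[2]}-{i:03d}" for i in range(1, n + 1)]
        for cell, n in available.items()
    }


def draw_subsample(roster, cell_size=CELL_SIZE, seed=1966):
    """Draw cell_size pupils without replacement from each design cell."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    cells = itertools.product(GRADES, SOCIAL_CLASSES, SEXES)
    return {cell: rng.sample(roster[cell], cell_size) for cell in cells}


subsample = draw_subsample(build_roster(AVAILABLE))
total = sum(len(pupils) for pupils in subsample.values())
print(len(subsample), "cells,", total, "Ss")
```

Because the smallest cell in Table 1 holds 48 pupils, a draw of 40 per cell is possible in every cell, which gives the balanced design of 480 Ss used in the analysis of variance.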
Table 2 summarizes the age by sex character-
istics of the data-analysis sample.
The upper-middle-class portion of the sample
was drawn from a suburban school system (in
a city coded OP) where the median years of
schooling completed by persons 25 years of age
or older was 12.4. Less than 3% of the
male labor force was unemployed. The median
family income was $8,657.
The lower socioeconomic portion of the sample
was drawn from the inner city area of a
metropolitan city (coded D) where the median
years of schooling completed was slightly less than
9. Over 20% of the male population was un-
employed, and the median family income was slightly
less than $3,500. At least half of the families in the
latter group could be defined as poverty families
by Office of Economic Opportunity standards.


Instrumentation 


The Ss were asked to rate on a 6-point scale
the degree to which they valued, and later, the
degree to which they liked, eight different school
activities ranging from learning about science and
nature to playing games or sports. Four ratings
pertained to the academic aspects of school; four
pertained to the social aspects of school. These
rating scales were part of a larger questionnaire
given in two sections, 1 week apart (see Mi…,
Bloom, & Dunn, 1961). Anxiety was measured
with a modified version of Sarason’s Test Anxiety
Scale, a scale which factor analysis has indicated
to be a more appropriate measure of school anxi-
ety than of simple test anxiety per se (…,
1964).


Data Analysis 


A three-factor analysis of variance (Winer, …)
was used to determine the main and interaction
effects of age, sex, and social class. Filter means
analysis, a partitioning program developed for
use by the University of Michigan Institute for
Social Research, was then used to determine the
precise nature of the interaction effects that proved
to be significant. Because it iterates through all
orthogonal combinations, filter means analysis may
be used to isolate the source of interaction effects
previously found to be significant through tradi-
tional analysis of variance.

²This information was derived from the 1960
United States Census data.


RESULTS 


Analysis of variance results are sum- 
marized in Table 3; eight of the nine 
analyses of variance yielded significant F 
ratios. Figures 1-3 summarize some of the 
more interesting data trends.

Lower-class children indicate a higher 
level of school anxiety than middle-class 
children (significant at .05 level). This re- 
sult was due almost entirely to the re- 
sponses of children at the elementary 
school level, however (see Figure 1). The 
difference in anxiety between adolescents
and preadolescents was not significant. An
inspection of the group means suggests a 
trend in the predicted direction, however. 


With regard to the predicted sex dif-
ferences, the converse was found. Girls
were found to have a significantly higher
(.05) level of school anxiety than boys.
Filter means analysis indicated this 
ANOVA main effect to be due, in large 
measure, to differences at the seventh grade 
level. There were, in fact, counter-indica- 
tions at the fifth grade level. 

Concerning affect-value patterns, it was 
found that as students grow older, they 
like all aspects of school, social as well 
as academic, less and less (significant at 
the .01 level). This was true for the most
part regardless of sex or social class. The
rate of decrease in affect for the academic, 
as well as the social, aspects of school was 
less for lower-class children, however. 
Also, as upper-middle-class children grow 
older, they tend to devalue, as well as dis- 
like the academic and social aspects of 
school (significant at the .01 level). This 
was not the case for lower-class children. 


TABLE 3
Summary of Hypotheses and Results

Variables and hypotheses         Significance   Comment

Age: Adolescents will
  dislike academics              .01ᵃ
  value academics                .01            Middle-class adolescents showed a
                                                significant decrease in value for
                                                academics.
Sex: Girls will
  like academics                 .05            Preadolescent girls liked academics
                                                more than boys.
  devalue academics                             Girls claimed they valued academics
                                                more than boys; preadolescent girls
                                                claimed they valued academics more
                                                than any other group.
Social class: Lower-class children will
  dislike academics                             Lower-class children report greater
                                                positive affect for academics than
                                                the middle-class children.
  value academics                .01ᵇ           Confirmed for lower-class adolescents
                                                only.
Anxiety: High anxiety level for
  adolescents
  males                                         Significant difference at the
                                                seventh-grade level, where males have
                                                lower anxiety scores than females.
  lower-class children           .05ᵃ

ᵃANOVA main effect.
ᵇANOVA interaction effect.
Fig. 1. School anxiety levels. (Note, in Figures
1-3, * signifies ANOVA differences significant at .05
level; ** signifies significant at .01 level; values
between vertical arrows signify levels of signifi-
cance of filter means differences.)


There was no decrease in their reported 
level of perceived value for the various 
aspects of school. 

Regarding sex differences, elementary 
school girls reported that they liked the aca- 
demic aspects of school more than ele- 
mentary school boys (.05 level), but there 
was a steady decline with age in both (a) 
the degree to which girls liked the aca- 
demic aspects of school (.01 level) and 
(b) the degree to which they valued them 
(.01 level). Girls, however, tended to re- 
main higher than boys in the value they 
placed on academics. Interestingly, there 
were no sex differences in the degree to 
which boys or girls in this sample valued 
or enjoyed the social aspects of school. 
There was a decrease with age, though, 
in the degree to which girls reported lik- 
ing the social aspects of school (.01 level). 

Regarding lower-class differences, lower- 
class adolescents reported they both valued 
and enjoyed the academic aspects of school 

more than upper-middle-class children
(.01). This was true for all grade levels
and for both sexes. They also claimed to
value the social contacts more (.01).

There were no social-class differences in
the degree to which children enjoyed the
social aspects of school, however. Judging
from these results, it is possible that the
social aspects of school are largely inde-
pendent of the value structure of school
authorities and are, in fact, far more in
the hands of the peer group than in those of
adult authorities.


DISCUSSION


In view of the author’s contention that
human motivation, as far as school be-
havior is concerned, should be given multi-
dimensional consideration, and that an in-
dividual’s motivational state at any point
in time probably involves a complex
hierarchy of a good many approach-
avoidance values, the success that was
achieved with a simple two-value para-
digm was surprising. Success was m[…],
though. It is probably the case that, as
far as adolescents are concerned, there are
many other school factors far more salient
for the determination of school approach-
avoidance than those chosen for the present

Fig. 3. Value for academics.


around the issue of social-class differences, 
however, the affect-value dimensions seem 
to take on much greater saliency; and 
hence are much more intimately associated 
with school anxiety. 

The age and sex findings of the present 
study, with regard to attitudes toward 
school are by and large consonant with 
earlier findings. It is the social-class find-
ings of the present study that are of
principal interest, especially in view of the
fact that they challenge some of the stereo-
typed notions that have long been enter-
tained with respect to lower-class children
in middle-class school settings. It appears,
for example, that the lower-class child
both appreciates and values the academic
phase of school much more than he has
been given credit for in the past. In ad-
dition, he also is apparently very much
concerned with doing well, at least as far
as is suggested by school anxiety.
These last findings, with regard to social
class and anxiety, have since been cor-
roborated by Sheila Feld³ at the National
Institute of Mental Health. Dr. Feld has
also found lower-class children to have a
higher degree of test anxiety than middle-
class children.⁴

³Personal communication, 1966.

Two separate hypotheses regarding why
this should be so may be suggested. One
holds that the lower-class child’s school 
anxiety is, in fact, reality oriented inas- 
much as he typically has met with a high 
degree of failure in school activities, hence 
confrontation with further possible failure 
is anxiety arousing. The other explanation 
holds that for the lower-class child educa- 
tional success is a necessary requisite for 
upward mobility, thus more of his future 
is at stake in school and in testing situa- 
tions than is the case with the middle-class 
child. 
If only school anxiety scores are in- 
spected, it would appear that the former 
hypothesis has the edge. But if value for 
academics is also considered, one is met 
with the peculiar pattern of lower-class 
values for academic pursuits remaining 
reasonably high whereas middle-class val- 
ues for academics fall off drastically at 
adolescence, and especially for males. 

⁴Phillips, in a study reported after the present
study was completed, found similar results.

It is possible that middle-class adoles-
cents increasingly see the academic as-
pects of school as having less and less bear-
ing on their eventual vocational success.
This is not the case with middle-class fe- 
males, however, who presumably are 
grooming themselves, at least temporarily, 
for a career. Whereas a middle-class male 
has certain social factors such as parental 
support, the possible entry into the father’s 
business, and the like going for him, girls 
must compete in the professional market- 
place on their own merit alone. Thus, 
middle-class females could be expected to 
be, and are, more like lower-class males in 
the degree to which they value the aca- 
demic aspects of school than they are like 
middle-class males, 

By way of summary, then, the results 
of the present study suggest that: 

1. As children grow older they increas-
ingly dislike both the academic and
the social aspects of school.

2. In elementary school, girls like and 
value academics more than boys, but these 
sex differences disappear as children grow 
more and more to dislike and devalue the 
academic aspects of school. 

3. Lower socioeconomic children at all
ages and both sexes report liking the aca- 
demic aspects of school more than upper- 
class children. As they grow older and 
move into adolescence, lower socioeco- 
nomic class children continue to value the 
academic aspects of school whereas their 
upper-middle-class counterparts come to 
increasingly dislike and devalue them. 

4. As lower-class children grow older 
they report that they also value the social 
aspects of school more than upper-middle- 
class children. 

5. Lower-class children, especially in
the elementary grades, give much more
evidence of being anxious with regard to 
doing well in school than do their middle- 
class counterparts. 
ACADEMIC PERFORMANCE WITH, AND WITHOUT,
KNOWLEDGE OF SCORES ON TESTS OF
INTELLIGENCE, APTITUDE, AND
PERSONALITY¹


ALFRED J. M. FLOOK² AND USHA SAGGAR³


The relationship between academic performance and knowledge of
test scores is examined. The 60 Ss were 1 year’s intake of students
into engineering courses at the University of St. Andrews, Scotland.
They were divided into 2 matched groups: Ss in Group K were
given detailed knowledge of their test scores; Group NK received
no such knowledge. In end-of-year examinations, Group K performed
better than Group NK (p < .001). This finding is discussed with
particular reference to (a) the role of anxiety in academic perfor-
mance, (b) Atkinson's theory of achievement motivation, (c) the part
played by other correlates of academic performance, and (d) the
optimum “feedback” of psychometric information to university stu-
dents and other learners. Group K’s superiority is interpreted as
originating in improved self-evaluation through social comparison,
with knowledge of test scores acting catalytically.


It is sometimes suggested that examina-
tion results suffer if students are at some
earlier stage informed of their scores on
tests of ability, aptitude, or personality.
According to this argument, the low scorers
become demoralized and the high scorers too
complacent, so that the later academic
performance of both is inferior to what it
would have been had they been kept in
ignorance of their test scores. In the United
Kingdom and elsewhere this is a common
objection to the disclosure of test scores to
students. In support of it, anxiety level is
usually postulated as being the key vari-
able intervening between knowledge of test
scores and academic performance. Against
the background of studies (e.g., Lynn &
Gordon, 1961; Savage, 1962) suggesting
that the relationship between anxiety and
performance obeys the curvilinear Yerkes-
Dodson principle, it is held that students
who know their test scores are low are
impeded by overanxiety in their later
work; whereas those who know their scores
are high, in their self-satisfaction, fall be-
low that “happy medium” level of anxiety
that spurs them on to their best perform-
ance.

¹The authors thank the Faculty of Applied
Science in the University of St. Andrews for its
unstinted assistance and tolerance throughout the
years in which data were being collected. Also,
they would like to express their special gratitude
to the 154 students who gave up a considerable
amount of their time, most of it to endure inten-
sive testing and interrogation, in order to expose
enough of their private and academic lives to
make the investigation possible.
²Now at the University of Dundee, Scotland.
³Now at the Department of Clinical Psychology,
Royal Dundee Liff Hospital, Dundee, Scotland.

Though plausible, the argument is open 
to the objection that the crucial factor is 
the form in which the feedback informa- 
tion is given. A mild version of this coun- 
terclaim asserts merely that “good” pres- 
entation cancels out any ill effects, while 
a stronger version claims that the feedback 
may take a form that outweighs any ill ef- 
fects, thereby producing a net gain. 

“To tell or not to tell” is a question of 
both practical and theoretical importance, 
yet reported attempts to obtain an experi- 
mental answer are lacking. The present 
exploratory study was included in a larger 
investigation (Saggar, 1961) in an attempt 
to resolve the question experimentally. 
Hypotheses and Experimental Design


The broad research hypothesis to be tested is 
that knowledge of scores on tests of ability, ap- 
titude, and personality results in poorer academic 
performance. The Ss were divided into two groups, 
between which the differences on all the test
variables were insignificant (p > .20). 

The two subhypotheses are that knowledge of 
test scores lowers the subsequent academic per- 
formances of students with (a) high scores and (b) 
low scores on the tests. The two groups were 
formed in a way which permitted the subdivision 
of each into three comparable sections, made up 
of matched pairs of Ss with scores in the top, 
middle, and bottom thirds of the distribution of 
intelligence-test scores. 
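
One way such a matched-pairs design could be set up is sketched below in Python. The article does not describe the matching procedure itself, so everything here is an assumption made for illustration: Ss are matched on a single hypothetical intelligence score (the study matched on all test variables), adjacent Ss in rank order form a pair, a seeded random shuffle decides which member goes to Group K, and the scores are invented. The function name `matched_groups` is hypothetical.

```python
import random

SECTIONS = ("top", "middle", "bottom")


def matched_groups(scores, seed=0):
    """Split Ss into two matched groups, tagging each pair with its third.

    scores maps a subject ID to one test score (a simplifying assumption).
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) % 2:
        raise ValueError("an even number of Ss is needed to form pairs")
    # Adjacent Ss in rank order have the closest scores, so pair them.
    pairs = [ranked[i:i + 2] for i in range(0, len(ranked), 2)]
    groups = {"K": [], "NK": []}
    for index, pair in enumerate(pairs):
        section = SECTIONS[index * 3 // len(pairs)]
        rng.shuffle(pair)  # random assignment within the pair
        groups["K"].append((pair[0], section))
        groups["NK"].append((pair[1], section))
    return groups


# Invented scores for 60 Ss, used only to exercise the sketch.
scores = {f"S{i:02d}": 20 + (i * 7) % 31 for i in range(1, 61)}
groups = matched_groups(scores)
print(len(groups["K"]), len(groups["NK"]))
```

With 60 Ss this yields 30 pairs, so each group contains 10 Ss from each third of the score distribution, which is the structure the two subhypotheses require.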


Subjects 


The Ss were 60 students (1 female and 59 male) 
admitted to engineering courses at the University 
of St. Andrews in October 1960. Of the complete 
intake of 62 students, only 2 declined to take part 
in the investigation. One of the 60 volunteers did 
not complete the degree examinations in June 
1961, which were used as the criterion in this ex- 
periment. Consequently, he and his pair in the 
other group were both omitted from the analysis 
of the results. Except where indicated, therefore, 
the results given and discussed in this paper relate 
to 58 Ss in two matched groups. However, the ex- 
periment formed part of a broader inquiry 
covering the intakes of the previous 2 years as 
well, and in the present paper reference is made 
explicitly to the collective data for the three con- 
secutive intakes, involving 154 students, where 
further insight seems to be obtainable from them. 


Materials 


The tests administered were the AH5 Group 
Test of High-Grade Intelligence (AH5; Heim, un- 
dated), the Engineering and Physical Science 
Aptitude Test (EPSAT; Moore, Lapp, & Griffin, 
1943), and the Maudsley Personality Inventory 
(MPI; Eysenck, 1959). Two specially designed 
questionnaires were also used, one concerned with 
methods of study, and the other with biographical 
data about each S, his home environment, motivation,
worries, etc. Additional personal information
was obtained in a series of individual interviews. 


Procedure 


The general situation regarding the collection 
of data was explained beforehand to each S in a
personal note from the Dean of the Faculty of 
Applied Science. In this he made it clear that the 
Faculty was sponsoring the investigation, that 
participation in it was voluntary, and that
information obtained about Ss at all stages of the
investigation would be treated in strict confidence.

The data were collected in the following stages:

Administration of tests and biographical
questionnaire. This was done at the start of the first
term in October 1960. When all the tests had been
scored, Ss were divided into two matched groups
of 30 students. These then passed through the
stages described below, except for the one S who
did not complete the degree examinations.

First basic interview. Early in the second term
all Ss were given an individual interview, in
which they answered questions on methods of
study. At the start of their interview, those in
Group K were given their own scores on
the three tests, together with enough normative
data for them to be able to see how they stood
in relation both to the original standardization
groups and to their own classmates (i.e., the
group of 60 entrant students). A special effort
was made to present this information in a way that
would be encouraging rather than intimidating,
without being misleading. It was pointed out that
the association between ability and academic
success, while positive, is not overwhelmingly
strong in a highly selected university population,
so that at that level other factors (strong motivation,
good study methods, hard work, etc.) have
a special importance. The other Ss, Group NK,
were told that no knowledge of their test scores
could be given to them for the time being, as
that variable was being controlled in order that
its effect on performance could be determined.

Supplementary interview on methods of study.
Later in the second term, an interview was given
to any S of either group who had asked for such
advice in the first interview.

Second basic interview. Early in the third term
all Ss were questioned on their performance in
the class examinations at the end of the second
term, and various data necessary for the main part
of the investigation were collected.

A striking feature of all the interviews was the
great willingness of Ss to discuss their personal
problems. It seems clear that the junior author,
who conducted the interviews, had gained their
confidence in the manner of an effective student
counselor.

First-year degree examinations. In June 1961
all Ss except one took degree examinations in
mathematics, physics, and chemistry. The marks
for these three examinations were normalized to
a mean of 55 and a standard deviation of [...],
then added for each student to form a criterion
score. This composite mark is used in Table 1
to compare the examination performance of the
two groups.
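
The construction of the criterion score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "normalized" is taken to mean a linear rescaling of each subject's marks, the target standard deviation is not legible in the text (so `target_sd=10.0` is a placeholder), and the marks and function names are hypothetical.

```python
from statistics import mean, pstdev


def rescale(marks, target_mean=55.0, target_sd=10.0):
    """Linearly rescale one examination's marks to a chosen mean and SD.

    target_sd is a placeholder: the value used in the study is not legible.
    """
    m, s = mean(marks), pstdev(marks)
    return [target_mean + target_sd * (x - m) / s for x in marks]


def criterion_scores(*examinations, target_mean=55.0, target_sd=10.0):
    """Sum each student's rescaled marks across examinations."""
    scaled = [rescale(e, target_mean, target_sd) for e in examinations]
    return [sum(per_student) for per_student in zip(*scaled)]


# Hypothetical raw marks for five students (not data from the study).
mathematics = [40, 55, 62, 71, 48]
physics = [35, 60, 58, 80, 50]
chemistry = [52, 49, 66, 75, 44]

scores = criterion_scores(mathematics, physics, chemistry)
```

With three examinations each rescaled to a mean of 55, the composite has a mean of 165 whatever the raw marks, which is in line with the group means of 178 and 156 reported in Table 1.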


RESULTS 


Main Hypothesis 


The main hypothesis is that knowledge
of test scores does result in poorer academic
performance. Its opposite is a compound
of two alternative counterclaims:
that knowledge of test scores makes no
difference in academic performance (the null
hypothesis); or that it results in better
performance. Consequently, there were, in
effect, two research hypotheses available,
predicting deviation from the null hypothesis
in opposite directions. To decide between
them, the null hypothesis was tested
by the two-tailed version of statistical tests
appropriate to the matched-pairs experimental
design.

The outcome, detailed in Table 1, is a 
highly significant difference in favor of 
Group K, its probability being either a 
little below or above the .001 level, accord- 
ing to the test applied. 
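
The two matched-pairs statistics used here, the paired t and the Wilcoxon signed-ranks T, can be sketched as follows. The scores are invented for illustration and are not the study's data; the function names are hypothetical, and no p values are computed.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """t statistic for paired samples: mean difference over its standard error."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))


def wilcoxon_t(x, y):
    """Wilcoxon matched-pairs T: the smaller of the two signed-rank sums.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they occupy.
    """
    diffs = sorted((a - b for a, b in zip(x, y) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative)


# Hypothetical criterion scores for 10 matched pairs (not the study's data).
group_k = [176, 190, 168, 181, 159, 172, 185, 177, 163, 189]
group_nk = [150, 171, 149, 160, 140, 151, 166, 148, 139, 162]

t_value = paired_t(group_k, group_nk)
T_value = wilcoxon_t(group_k, group_nk)
```

When every pair favors the same group, as in this invented example, the smaller rank sum is zero, which is the pattern reported for the bottom section in Table 1.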


Subhypotheses 


The subhypotheses are that knowledge 
of test scores lowers the subsequent aca- 
demic performance of high-scoring and 
low-scoring Ss. The results, detailed in the 
lower rows of Table 1, were: 

Subjects with high test scores. The difference
in performance of Ss with high test
scores was not significant, according to the
two-tailed tests used. However, it favored
Group K, not Group NK, and it almost
reached the .05 level of significance for the
one-tailed versions of the tests.

Subjects with low test scores. The same
trend was unmistakable for the 10 low-scoring
pairs. Here the 10 Ss who had been
told their test scores proved decisively superior
to their counterparts in Group NK,
the difference easily exceeding the .01 level


TABLE 1
Groups K and NK (and Their Sections)
Compared on Criterion Scores

                          Group K       Group NK      Comparisons
Subjects                  M     SD      M     SD      T(a)     t(b)

Whole groups (N = 29)     178   19.1    156   24.1    61*(d)   3.80**
Sections(c)
  Top (N = 10)            182   19.0    160   32.2    13       1.72
  Middle (N = 9)          176   21.6    158   19.4     9       1.60
  Bottom (N = 10)         176   16.4    149   15.8     0*      4.47*(e)

(a) Two-tailed Wilcoxon matched-pairs signed-ranks test.
(b) Two-tailed t test for paired samples.
(c) Of the two groups, in terms of AH5 scores, which ranged over
    54-42, 41-36, and 36-23, respectively.
(d) p = .0014.
(e) With 9 df, a two-tailed p of 0.001 corresponds to a t of 4.781.
* p < .01.
** p < .001.
of significance and nearly reaching the .001
level. 

There are two general features that de- 
serve comment. First, although only two 
of the four differences reported in Table 1 
reached significance, all four were in the 
direction of Group K’s superiority to 
Group NK. Second, the bottom section of 
Group K had a mean criterion score equal 
to that of the middle section of Group K, 
and larger than that of every section of 
Group NK. In other words, there is an in- 
dication that, while knowledge of test 
scores conferred an advantage throughout 
the range of ability in Group K, its im- 
pact was greatest on the bottom third, with 
the lowest test scores. (The possible con- 
nection between this and the fact that only 
2 years previously the first-year failure 
rate had climbed to 33% is one of the 
questions considered.) 


Discussion 


Considering how few Ss were available, 
the result is remarkably clear-cut. This 
highly significant difference must have 
come entirely from an above-normal per- 
formance by Group K, if Group NK can 
justifiably be regarded as a control group. 
That seems a reasonable view, as test scores 
are not normally available to these students. 
However, the difference could conceivably 
have arisen entirely from a below-normal 
performance by Group NK, or it might have 
been a joint effect in Group K’s favor 
which emerged because the performance of 
both groups had deviated, up or down, 
from the norm of previous years. To 
clarify the point, the failure rate for 
1956-60 was analyzed. The information 
available was not such as to allow an 
absolutely definitive conclusion to be 
drawn, but the analysis did suggest very
strongly that Group K had performed 
above the recent norm, as well as signifi- 
cantly better than Group NK, as a result 
of having been told their test scores. 

Since such an outcome is so satisfactory 
educationally, and as test scores are nor- 
mally available nowadays for so many 
students and other “learners,” the question 
of the generality to be granted to this result
has special practical importance. The
Ss came from a narrow band of the educa- 
tional spectrum. Not only were they all 
university students, drawn from one uni- 
versity, but they were all studying engi- 
neering, and all but one were male. How- 
ever, they may reasonably be regarded as 
being a representative sample of that 
special group, for they comprised all but 2 
of a year’s intake of 62. Furthermore, there 
is no obvious reason why the result should 
be unique to such students, but the extent 
to which it can be generalized beyond them 
is a question needing further study. 

Another striking feature of the result is 
its unexpectedness. Not only does it run 
counter to the opinion prevalent in Great 
Britain that knowledge of test scores de- 
presses subsequent academic performance, 
but also it goes beyond the intermediate 
view that this factor is immaterial, by 
creating the strong presumption that it has 
a facilitating influence. Such a “booster” 
effect, if confirmed in replications, would 
have theoretical and practical implications 
that make it important to consider possible 
explanations. 


The Role of Anxiety in Academic 
Performance 


Previous research suggests a curvilinear 
relationship* between anxiety and perform- 
ance, with differential anxiety as the 
mediating factor. Even though no direct 
measure of anxiety was available for Ss of 
this experiment, it is obvious that its re- 
sults refute a simple differential anxiety 
hypothesis. Evidently, some more complex 
role for anxiety as an intervening variable 
is needed if it is to fit the facts—a point 
stressed by other workers in this field (e.g.,
Stein, 1963).

* In the form of an inverted-U curve, like that
shown in Figure 1, if anxiety is represented
by the abscissa.


Atkinson’s Theory of Achievement
Motivation


One of the purposes of Atkinson’s (1957)
theory is to account for performance level
when only one task is presented to Ss in a
competitive setting. That was the situation
confronting Groups K and NK, whose
members were taking a compulsory curriculum.
Most of them may be assumed, in
view of the long record of academic success
necessary to gain admission to a British
University, to have been stronger in the
motive to achieve success (M_S) than in the
motive to avoid failure (M_AF). And Atkinson,
in applying his theory to persons
in whom M_S > M_AF, concludes that the
strength of motivation to perform a task
(when no alternatives are offered) should
be greatest when such persons are most uncertain
about the result, that is, when the
probability of their success (P_s) is .50. And
so, performance level, through which
strength of motivation is expressed, should
be a bell-shaped function of P_s (as shown
in Figure 1) when differences in ability are
controlled, as in the matched-pairs design
used for this experiment.

There is evidence (Atkinson, 1964; Atkinson
& Feather, 1966) that individual
differences in intelligence or aptitude may
serve as cues to define a person’s P_s in a
competitive academic setting. Conceivably,
the members of Group K, having a clear
indication of their relative standing on the
tests, developed sharply defined P_s values
scattered widely on either side of the critical
central value. By contrast, the members
of Group NK were all alike in their
ignorance of their test scores, and that
greater uncertainty could have had the
centripetal effect of inclining them to
cluster closer to a P_s of .50. Under such
conditions, the theory would lead one to
expect that Group NK would perform better
than Group K. However, the information
available on the motivational dynamics
of the experimental situation is
ambiguous enough for the theory to yield,
on other assumptions, a prediction in favor
of Group K, in line with the actual result.
The possibility of such contradictory
speculation highlights the essential point
that the clear need now is for a series of
experiments in which the crucial intervening
variables in a complex situation are
identified, measured, and controlled in the
kind of systematic and penetrating analysis
which the present exploratory study,
made under field conditions, was not intended
to provide.

Fig. 1. Strength of motivation to achieve, or
avoid failure, as a function of subjective probability
of success. (A slightly modified form of
Figure 1 in Atkinson, “Motivational Determinants
of Risk-Taking Behavior,” Psychological Review,
Vol. 64, 1957, 359-372).
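
The prediction that motivation peaks at a success probability of .50 follows from the usual multiplicative statement of Atkinson's (1957) model, which the text above does not spell out. The sketch below assumes that form: the tendency to approach success is M_S x P_s x (1 - P_s), and the tendency to avoid failure is M_AF x (1 - P_s) x (-P_s). The motive strengths used are arbitrary illustrative numbers, and the function name is hypothetical.

```python
def resultant_motivation(m_s, m_af, p_s):
    """Resultant achievement motivation under the assumed multiplicative model.

    Incentive of success is taken as (1 - P_s); probability of failure is
    (1 - P_s) and its incentive is -P_s, so the sum reduces to
    (M_S - M_AF) * P_s * (1 - P_s).
    """
    approach = m_s * p_s * (1.0 - p_s)
    avoidance = m_af * (1.0 - p_s) * (-p_s)
    return approach + avoidance


# Arbitrary motive strengths with M_S > M_AF, as assumed for these Ss.
probabilities = [i / 10 for i in range(1, 10)]
curve = {p: resultant_motivation(2.0, 1.0, p) for p in probabilities}
peak = max(curve, key=curve.get)
```

For any M_S > M_AF the curve is symmetric about P_s = .50 and highest there, so a group whose P_s values are spread away from .50 would be expected, on this reading, to show lower average motivation than a group clustered near .50.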


The Part Played by Other Correlates of 
Academic Performance 


As already indicated, the experiment
formed part of a broader inquiry into the
relationship between academic performance
and many factors presumed to be
influential independent variables. Some of
these were, in fact, found to be correlated
with academic performance; and it is conceivable
that Group K’s superiority came
from a membership more favorably endowed
than Group NK’s with these attributes,
exerting an influence either antecedent
to, or concurrent with, knowledge of
test scores.

Five variables found to be associated
significantly (Numbers 1, 2, and 3 at the
[...] level) with academic success were:

1. The S’s report that his work was not
interfered with by daydreaming.

2. Normal home background (operationally
defined as one which, according to
S, contained none of the domestic abnormalities
on the list used in this inquiry).

3. The S’s report of freedom from fi[...]

4. Parent’s income being in the top,
rather than the middle, section of a specified
3-part range.

5. Parent’s occupation falling into
Grades I and II, rather than II-VI, of
the Hall-Jones scale (Hall & Jones, 1950).

Of these, the first alone may have given
some advantage to Group K, whereas the 
other four tended to favor Group NK. It 
seems therefore that, at the very least, the 
matching of the two groups put Group K 
in no stronger position than Group NK in 
terms of these five variables, viewed as a 
whole.

There remains the possibility of explain- 
ing Group K’s superiority in terms of vari- 
ables intervening between Ss’ being told 
their test scores and the degree examina- 
tions nearly 6 months later. The first in- 
dication of such a chain of events is to be 
seen in the unequal advantage taken by Ss 
of the availability of advice on methods of 
study. The striking difference is that 24 of 
Group K, as against 11 from Group NK, 
asked for the help in this direction which 
had been offered impartially to all 58 Ss. 

Further light is shed by Ss’ answers to 
the question, “Approximately how many 
hours do you devote daily to your studies, 
that is, apart from the university hours?” 
Some gave their answer there and then, but 
others required time to work it out. The 
latter reported theirs 2 or 3 weeks later. 
All the data are summarized in Table 2. 


TABLE 2
Groups K and NK Compared on Hours Spent
Weekly on Private Study, According to
Students’ Own Replies to Questions
in Interview

                   Group K          Group NK
Replies            M         SD     M         SD      t(a)

Immediately        15.2(b)   3.31   14.5(c)   3.35     .51
Later              24.4(d)   3.95   13.0(e)   3.86    8.14*
Total              20.2      5.79   13.6      3.76    5.08*

(a) Two-tailed test.
(b) N = 13.
(c) N = 12.
(d) N = 16.
(e) N = 17.
* p < .001.
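
The t values in Table 2 can be approximately recovered from the tabled means, standard deviations, and group sizes. The sketch below assumes an independent-samples t with a pooled variance and assumes the tabled SDs were computed with divisor N; neither assumption is stated in the text. The Group K "immediately" SD is read as 3.31, the reading consistent with the tabled t of .51.

```python
from math import sqrt


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from summary statistics.

    Assumes sd1 and sd2 are SDs with divisor N, so n * sd**2 is each
    group's sum of squared deviations.
    """
    pooled_var = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# (Group K mean, SD, N, Group NK mean, SD, N, tabled t)
table_2 = {
    "Immediately": (15.2, 3.31, 13, 14.5, 3.35, 12, 0.51),
    "Later": (24.4, 3.95, 16, 13.0, 3.86, 17, 8.14),
    "Total": (20.2, 5.79, 29, 13.6, 3.76, 29, 5.08),
}

recomputed = {
    row: t_from_summary(*values[:6]) for row, values in table_2.items()
}
```

Under these assumptions each recomputed value falls within a few hundredths of the tabled t, the remaining gap being attributable to rounding in the published means and SDs.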




These figures raise several questions, but 
the most plausible interpretation seems to 
be that knowledge of test scores had a de- 
layed-action effect on work habits, induc- 
ing the 16 members of Group K who replied 
later to increase their working week in the 
light of their sober reflections on the aca- 
demic future painted for them by their test 
scores. If so, a similar reassessment may 
have been made subsequently by their 13 
fellow members who had replied on the 
spot, which would, of course, have swollen 
the overall difference between the two 
groups in hours of study. 

All this suggests that the simple expla- 
nation for Group K’s superiority is that 
collectively they worked harder and more 
effectively than Group NK, but that is an 
unenlightening tautology. Greater justice 
would be done to all the evidence available 
by suggesting that knowledge of his test 
scores gave each member of Group K a 
clearer and more realistic picture of his 
academic possibilities than that possessed 
by his counterpart in Group NK, and that 
this sensitizing experience moved him to 
take better advantage of the general context 
of help that was equally available to both 
groups. In other words, the impetus to- 
ward superior performance may have been 
supplied by improved self-evaluation 
through social comparison, which has been 
proposed as a major motive underlying 
social behavior (Latané, 1966). The fact 
that a number of Group K’s members 
asked spontaneously for the norms for their 
own classmates immediately after being 
given the original standardization norms 
lends empirical support to this suggestion. 

Clearly, once this psychometric informa- 
tion had been given, advice on study meth- 
ods was a key variable, but it was only 
one ingredient in a complex of supporting 
factors. Another important component was 
the almost continuous accessibility of the 
junior author to Ss for the discussion of 
their work and problems. Being Indian and 
a woman may have helped her to establish 
close rapport, which itself might suggest 
that the results of the experiment reproduce
the “Hawthorne” effect. That seems doubtful,
however, since the same interest was
taken in both groups throughout, the only
difference being the quantitatively minor
one involving the disclosure of test scores.

The care taken to give Group K their
test scores in a positively energizing manner
was also intended to be a facilitating
ingredient. In this respect, it is notable that
it was the bottom third of Group K (in
terms of AH5 scores) who contributed
most to the overall superiority of that
group, and the first-year failure rate (a
vital fact of academic life which most entrant
students get to know before, or soon
after, admission) had reached 33% only 2
years previously. The Ss in the bottom
section of Group K were, therefore, in possession
of full information from which
they could conclude that they were in the
academic danger zone.

In summary, the interpretation suggested
is that knowledge of test scores
acted like a catalyst, setting up in Group
K [...] performance.
It did so by first clarifying their relative
standing in academic [...],
thereby creating an informed concern
about future performance that impelled
them both to work harder and to exploit
fully the facilities available to all for
maximizing academic success. This assigns
a central activating role to one aspect of
anxiety (the phrase “informed concern”
being intended to convey that Group K’s
alertedness took a predominantly facilitating
form), but this is speculation, as the
experiment included no measure of anxiety
specific to academic performance.


Further Research Needed 


Several questions have been brought up
that point to a need for the experiment to
be replicated and systematically extended.
Its theoretical implications call for a penetrating
exploration of the complex social
context of academic performance to elucidate
the contributions of psychometric
information, of anxiety, and of their interplay
with each other and with other variables
in the educational setting. Analysis
along similar lines would be necessary to
clarify the relationship of this experimental
finding to Atkinson’s theory of achievement
motivation.

As for its practical significance, the find- 
ing suggests that the wealth of psycho- 
metric data on record about students and 
other learners in so many parts of the 
world may have a productive potential that 
has been overshadowed by the diagnostic 
and prognostic functions that are now 
established traditions. Such a claim would 
rightly provoke a demand for cautious 
application, in view of the issues still un- 
resolved. One crucial question is: What 
forms should psychometric “feedback” 
take so as to produce the best effects? 
Others are facets of the problem of generality.
If confirmed on similar populations,
how far could this finding be generalized?
Is it specific to certain age or
ability levels, to certain subjects of study, 
to males rather than females? In view, too, 
of the evidence (McClelland, 1961) for 
cultural differences in achievement moti- 
vation, is it a phenomenon peculiar to cer- 
tain societies and absent from others? It is 
clear that the finding has implications, 
both practical and theoretical, calling for a 
comprehensive understanding of the social
complex in which the effect was embedded.


MENTAL RETARDATION, MENTAL AGE, AND
LEARNING RATE 


ARTHUR R. JENSEN AND WILLIAM D. ROHWER, JR.
University of California, Berkeley 


Zigler’s hypothesis that mental age (MA) and not IQ determines the 
rate of learning is examined in the light of empirical evidence compar-
ing the learning rates of normal and retarded children and young 
adults matched for MA. The results show that learning rate is a 
function of IQ as well as of MA. In general, children of average IQ 
learned serial and paired-associate lists significantly faster than re- 
tarded young adults with IQs between 50 and 60 but with approxi- 
mately the same MA as the children. An interaction between IQ, 
learning rate, and socioeconomic status is also noted. 


Zigler has now stated (1967a) and re- 
stated (1967b) a central theme of his 
theoretical position regarding mental re- 
tardation that “...it is the MA [mental 
age] (level) and not the IQ (the rela- 
tionship of MA to chronological age) that 
determines the exact nature, including the 
rate, of learning any task [1967b, p. 579].” 
Thus, two persons of different chronologi- 
cal age (CA) and different IQ but matched 
on MA should show similar learning rates. 

Weir (1967) has challenged Zigler’s 
statement on essentially the following ba- 
sis: If MA is a measure of the knowledge 
an individual has accumulated by a given 
CA, the rate of acquisition of this knowl- 
edge is represented by the IQ, which is 
(MA/CA) x 100. Therefore, contrary to 
Zigler’s position, persons of the same MA 
but differing in IQ should show different 
rates of learning, even in short-term learn- 
ing tasks. There is evidence that Weir’s 
prediction is indeed borne out in the case of 
laboratory learning tasks. 
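
Weir's point rests on the ratio definition of IQ quoted above, which can be made concrete with a small calculation. The ages below are hypothetical round numbers chosen for illustration, not data from any of the studies cited, and the function name is an assumption.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ as defined in the text: (MA / CA) x 100."""
    return 100.0 * mental_age / chronological_age


# Two hypothetical persons matched on MA but differing in CA.
child = ratio_iq(mental_age=9, chronological_age=9)
older = ratio_iq(mental_age=9, chronological_age=15)
```

Both persons have accumulated the same MA, but the older one took 15 years rather than 9 to do so, which is why Weir reads the IQ (here 100 versus 60) as an index of the rate of acquisition.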

The obscurities in the argument between 
Zigler and Weir can be overcome by mak- 
ing a conceptually clear-cut distinction be- 
tween developmental rate and learning 
rate. There is much evidence (White, 1965) 
that mental abilities have a hierarchical 
structure, the development of which fol- 
lows a chronological sequence; the mile- 
stones of this developmental sequence are 
marked by the increasing complexity of 
the cognitive structures (e.g., heuristics, 
symbolic mediators, strategies, information 
processing skills) which the individual can 
bring to bear on solving problems. The ages 
at which individuals attain these stages 


of cognitive development are regarded 
as indexes of developmental rate. But two 
individuals who are at the same develop- 
mental stage and who have arrived at this 
stage at either the same or at different 
rates of development, may still differ in 
the rates at which they can acquire new in- 
formation. This is distinguished as learn- 
ing rate. Thus, individuals can be re- 
tarded or normal in developmental rate 
and retarded or normal in learning rate. 
Retardation in either realm will spell retardation
as assessed by traditional intelligence
tests, since these are a mixture of
items that measure acquisition (e.g., vocabulary
and general information subtests)
and cognitive structures (e.g., problems
involving logical reasoning). The 2 × 2
combinations indicated by this formulation
suggest three possible classifications
of familial retardates. Normal developmental
rate and normal learning rate are both
necessary for the manifestation of normal
intelligence, as traditionally defined; neither
alone is sufficient.

Our data pertain only to the relationship
of MA to learning rate. No inferences are
made here concerning the issue of developmental
rate.

Jensen (1965) matched 40 institutionalized
mentally retarded young adults
(mean IQ = 58) with no known [...]
defects with 40 normal school children
(mean IQ = 105) on MA (9 years). In
both serial and paired-associate rote learning,
the normal children had learning rates
some 3 to 4 times faster, on the average,
than the adult retardates. Moreover,
although there was no significant difference
in the standard deviations for MA in
the two groups, the retardates showed a
significantly greater standard deviation of
learning scores than the normals. The
greater heterogeneity of learning rates of
groups of retardates as compared with normals,
when the groups are equally homogeneous
in IQ and MA, was further substantiated
in a study comparing learning
rates in retarded, average, and gifted children
(Jensen, 1963). There are evidently
more ways of being retarded than of being
either average or gifted in mental ability.

Rohwer (1967) compared a group of 48
institutionalized familially retarded adults
with groups of normal children in Head
Start and kindergarten and in Grades 1, 3,
and 6 on paired-associate learning. The
children were sampled from populations of
low- and middle-socioeconomic status
(SES). (The MA is close to the CA for the
school children, but is slightly lower in the
low-SES groups.) The results, shown in
Figure 1, indicate that the average learning
score of the retardates is significantly
lower than that of any of the other
groups as well as being significantly lower
than all the other groups combined (F =
103.22, df = 1/396, p < .01). Comparison
of the learning performance of the adult
retardates and the middle-SES third graders
is especially revealing, since the two
groups have approximately the same MA
(9.7 versus 9.6). Also, there was a larger
standard deviation of learning scores in the
retarded group than in any of the normal
groups.

The relationship between learning rate
and MA, at least in the mildly retarded
(i.e., IQs of 50 to 75), is further complicated
by socioeconomic status. Rapier
(1968) closely matched Caucasian middle-
and low-SES elementary school children
(N = 20 in each group) in classes for the
retarded on CA (124 months), MA (88
months), and IQ (70). None of the Ss
evinced any organic defects. The low-SES
children showed consistently and significantly
faster rates of paired-associate
learning than the middle-SES children.

In view of the present results and consistent
with our conceptualization, equivalence
of developmental level need not
imply equality of performance on intellectual
tasks, specifically, learning tasks.
When equal-MA comparisons involve normals
and familial retardates, differences in
learning rate are to be expected, and, indeed,
are found.

[Figure 1: bar graph. Ordinate: Correct Responses in Two Trials.
Abscissa: Grade (Head Start & Kindergarten, 1, 3, 6) and Retarded
Adults. Bars: Low SES, Mid SES, Retarded Adults.]

Mean CA by group:
               Hd. St. & K    1       3       6
Low SES        5.31           6.92    8.97    12.06
Mid SES        5.32           6.60    8.59    11.60
Retarded adults: CA = 25.56, MA = 9.70

Fig. 1. Comparisons of low- and middle-socioeconomic
groups of children at various grades in
school with institutionalized retarded adults on
paired-associate learning consisting of 24 picture
pairs presented two times at a rate of 4 seconds
per pair. N = 48 in each of the nine groups.


EVALUATING THE STUDY HABITS AND ATTITUDES OF
HIGH SCHOOL STUDENTS’


WAYNE H. HOLTZMAN
University of Texas

AND

Consisting of 100 items similar to those in the college version of the
Survey of Study Habits and Attitudes (SSHA), Form H was developed
and standardized on 11,218 students in Grades 7-12 from
school systems in Texas, Colorado, Illinois, Maryland, Missouri,
and Utah. Complete data, consisting of the SSHA, Form H, a scholastic
aptitude test, and a grade-point average based on subsequent
academic performance, were available for 10,888 students. Validity
coefficients consisting of correlations between the total SSHA score,
Study Orientation, and grade-point average ranged from .32 to .66,
with an average value of .49. Correlations between the scholastic
aptitude test scores and grades were only slightly higher, ranging
from .19 to .83 with an average of .57. The low correlation between
SSHA and scholastic aptitude (-.04 to .54 with a mean of .27) indicated
that the SSHA measures important traits related to school achievement
that are untouched by the standard scholastic aptitude tests.


When first developed, the Survey of 
Study Habits and Attitudes (SSHA; Brown 
& Holtzman, 1953) was designated primar- 
ily for high school seniors and college 
freshmen. Consisting of 75 items, the orig- 
inal SSHA was standardized on samples 
comprised of nearly 4,500 freshmen en- 
rolled in 11 different colleges and several 
hundred high school seniors. Scores on a 
suitable scholastic aptitude test and grade- 
point averages for the semester following 
administration of the SSHA were available 
to determine the validity of the SSHA for 
predicting academic success. The correla- 
tion between SSHA score and later grades 


the country: William H. Dibrell, Fran 

ton, and Glenn Campbell, San eet Be 
Vernon Zunker, Sequin, Texas; James Williams, 
Glen Ellyn, Illinois; Dale Jackson, Austin, Texas; 
Catherine L. Beachley, Hagerstown, Maryland; 
Walter Bergmann, St. Louis, Missouri; Neids
Clark, Durango, Colorado; Herbert Kaczmarek,
Gunnison, Colorado; and Le Nora L. Losee, Salt
Lake City, Utah. Data analysis was carried out
with the assistance of Donald Witzke, Sara Currie,
and Luis Laosa. Portions of this paper have been
adapted from the Survey of Study Habits and
Attitudes Manual, Forms C and H, written by the
same authors and published by the Psychological 
Corporation, New York. 


ranged from .27 to .66 for men and from 
.26 to .65 for women. Consistently low 
correlations between the SSHA and scho- 
lastic aptitude made it possible to increase 
appreciably the prediction of grades by 
combining both scores (Brown & Holtzman,
1955; Holtzman, Brown, & Farquhar,
1954).
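
Why a low correlation between two predictors raises their combined validity can be seen from the standard formula for the multiple correlation with two predictors. The sketch below is illustrative only: it plugs in the average Form H coefficients from the abstract (read as .49 for SSHA with grades, .57 for aptitude with grades, and .27 for SSHA with aptitude), whereas a real analysis would use the coefficients from a single sample. The function name is hypothetical.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: each predictor's correlation with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


# Illustrative inputs: average coefficients from the abstract.
combined = multiple_r(0.49, 0.57, 0.27)

# Same validities, but with highly overlapping predictors.
overlapping = multiple_r(0.49, 0.57, 0.80)
```

With these inputs the combined coefficient comes out near .67, above either predictor alone, while the same two validities with an intercorrelation of .80 would add almost nothing to the better single predictor.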

Successful application of the SSHA to a
variety of problems ranging from identifying
students who needed counseling to research
on achievement motivation, created
a demand for a similar instrument which
could be given to children in junior and
senior high schools. The Criticism of Education
scale given to over 13,000 high
school students in 1956 as part of the
Texas Cooperative Youth Study ([...]
Holtzman, 1965) consisted of items like
those in the SSHA dealing with scholastic
motivation. A year later, an experimental
version of the SSHA containing items suitable
for junior high school was used successfully
in a research program involving
1,470 seventh graders in four small [...]
(McGuire, Hindsman, King, & Jennings,
1963). Since revision of the SSHA was already
underway to improve the questionnaire’s
usefulness for counseling [...],
the possibility of a parallel development
to produce two forms, one for Grades [...]
and the other for Grades 7-12, was particularly
attractive. Form C of the SSHA for
use at the college level has been described
elsewhere (Brown & Holtzman, 1966). The 
present paper deals with the development 
of Form H for Grades 7-12 and research 
on its reliability and validity. 


Development of the SSHA, Form H 


Preliminary stages in the development of 
Form C for use with college students were 
completed before work on the high school 
version was begun. In addition to 70 items
from the original SSHA, Form C contained
30 new statements which dealt largely
with attitudes toward education and toward
teachers. Enough progress had been
made on the college revision to simplify
development of Form H for use in junior
and senior high schools. Operating independently,
two committees of teachers
undertook to revise items in Form C so
they would conform with instructional procedures,
academic requirements, and study
conditions typical of Grades 7-12. A student
committee then rephrased some of the
items to eliminate wording likely to prove
confusing to young teenagers. Of the 100
statements in Form C, only 17 had to be
modified appreciably to maintain the same
basic meaning.

The resulting version of the SSHA was 
then administered to all students in Grades 
7-12 in San Marcos, Texas. Very few students
had any difficulty in understanding
and responding to the reworded statements.
Similar results were obtained for
Grades 7-9 in Livonia, Michigan, by Morris
(1961), who found a substantial correlation
between SSHA scores and teachers’
ratings of academic performance.

The 100 items in Form H were assigned
to one of four subscales in exactly the
same manner as they were classified in
the revised college version, Form C (Brown
& Holtzman, 1966), to preserve the parallel
nature of the two forms. The original
derivation of these four scales for Form C
involved a priori classifications by 15 independent
judges and item-subscale intercorrelations
on a sample of 568 college
freshmen. Containing 25 items each, the
four subscales are Delay Avoidance, Work
Methods, Teacher Approval, and Education
Acceptance. Scores on the first two subscales
can be combined to give the Study
Habits score, and scores on the last two
subscales yield the Study Attitude score
when added together. The Study Orientation
score is obtained by combining all
four subscales.

As in the original SSHA, each item 
receives a weight of 0, 1, or 2, depending 
upon the empirically determined cutting 
points on the five-choice response continuum
of each item. Since each of the four
basic scales has a maximum raw score of
50, the maximum score on Study Habits
and on Study Attitude is 100, and the highest
possible score for Study Orientation is 200.
In the final published version of Form H,
either IBM 805 or IBM 1230 answer sheets 
are available for machine scoring; special 
stencils are also provided for hand scoring 
of IBM 805 or 1230 answer sheets if desired 
(Brown & Holtzman, 1967). The Diag- 
nostic Profile on the reverse side of the 
answer sheet facilitates interpretation and 
gives the counselor a convenient graphic 
record. 
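
The arithmetic of the derived scores can be sketched as follows. This is an illustrative sketch only, not the published scoring key: the empirically determined cutting points are not reproduced here, so the function assumes that each item has already been converted to its weight of 0, 1, or 2, and the function and dictionary names are hypothetical.

```python
from typing import Dict, List

# Hypothetical labels for the four basic subscales named in the text.
SUBSCALES = ("delay_avoidance", "work_methods",
             "teacher_approval", "education_acceptance")
ITEMS_PER_SUBSCALE = 25


def score_ssha(item_weights: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine per-item weights (0, 1, or 2) into the seven SSHA scores."""
    scores: Dict[str, int] = {}
    for name in SUBSCALES:
        weights = item_weights[name]
        if len(weights) != ITEMS_PER_SUBSCALE:
            raise ValueError(f"{name}: expected {ITEMS_PER_SUBSCALE} items")
        if any(w not in (0, 1, 2) for w in weights):
            raise ValueError(f"{name}: item weights must be 0, 1, or 2")
        scores[name] = sum(weights)  # raw subscale score, at most 50
    # Derived scores: sums of subscale pairs, then of all four subscales.
    scores["study_habits"] = scores["delay_avoidance"] + scores["work_methods"]
    scores["study_attitudes"] = (scores["teacher_approval"]
                                 + scores["education_acceptance"])
    scores["study_orientation"] = (scores["study_habits"]
                                   + scores["study_attitudes"])
    return scores
```

A student given the maximum weight on every item would thus obtain 50 on each subscale, 100 on each derived scale, and 200 on Study Orientation.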


Reliability of Form H 

Extensive studies of internal consistency 
and test-retest reliability for the four basic 
SSHA subscales as well as the Study Orien- 
tation score were first carried out for Form 
C, the college version of the SSHA. Using 
the Kuder-Richardson Formula 8 (Kuder & 
Richardson, 1937) for estimating test re- 
liability from the variance of total scores 
and the sum of the item variances, coeffi- 
cients ranging .87-.89 were obtained for 
the four basic subscales. A test-retest study 
over a 4-week interval with 144 freshmen 
who were given Form C yielded reliability 
coefficients of .93, .91, .88, and .90, respec- 
tively, for the Delay Avoidance, Work 
Methods, Teacher Approval, and Educa- 
tion Acceptance scales. The corresponding 
coefficients for a sample of 51 freshmen 
with a 14-week interval were .88, .86, .83, 
and .85, respectively. Since these results 
cannot be generalized with certainty to 
Form H, the high school edition, in spite of 
the high degree of similarity in the two 
forms, one additional study was conducted
using Form H with junior high school students.

A sample of 237 ninth graders in San 
Marcos High School was given the SSHA 
twice, with an interval of 4 weeks between 
sessions. The test-retest reliability coeffi- 
cients were .95, .93, .93, and .94, respec- 
tively, for the Delay Avoidance, Work 
Methods, Teacher Approval, and Educa- 
tion Acceptance scales. The means and 
standard deviations remained essentially 
unchanged over the 4-week period. 
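
A test-retest coefficient of this kind is ordinarily the product-moment correlation between the scores from the two sessions. The sketch below assumes that this is the statistic intended; the function name and any data passed to it are hypothetical and are not taken from the San Marcos sample.

```python
import math
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Product-moment correlation between two administrations of a scale."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two score lists of equal length (n >= 2)")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sums of cross-products and squared deviations about the means.
    s_12 = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    s_11 = sum((a - mean_1) ** 2 for a in first)
    s_22 = sum((b - mean_2) ** 2 for b in second)
    return s_12 / math.sqrt(s_11 * s_22)
```

Applied to one subscale at a time, with each student's first and second scores in matching positions, the function yields one coefficient per subscale.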

The stability of Form H scores for ninth 
graders compares favorably with the stability
of Form C scores for college freshmen.
These studies indicate that the four
subscales are sufficiently stable through
time to justify their use in predicting future
behavior or in assessing the degree of
change in study habits and attitudes after 
counseling. 


Standardization and Validation 


Preliminary standardization of the 
SSHA, Form H, was carried out in the fall 
semester, 1964, on 3,731 students in Grades 
7-12 in 10 junior and senior high schools 
located in small towns throughout central 
Texas. Correlations between the Study 
Orientation score and grade-point averages 
at the end of the year ranged from .31-.85, 
sufficiently high to justify a standardization 
program on a national scale. 

Arrangements were made to collect data 
during the fall semester, 1965, from stu- 
dents in Grades 7-12 in Austin, Texas; 
Durango, Colorado; Glen Ellyn, Illinois; 
Gunnison, Colorado; Hagerstown, Mary- 
land; Salt Lake City, Utah; and St. Louis, 
Missouri. Scores on an acceptable scholastic 
aptitude test and subsequent grade-point 
averages for the fall semester were ob- 
tained for nearly all of the students tested. 
When added to the central Texas samples 
studied the year before, the new SSHA 
protocols yielded a total of 11,218 students 
upon which percentile norms have been 
developed: 5,425 cases for junior high
school norms (Grades 7-9) and 5,798 cases
for senior high school norms. Complete
data, consisting of the SSHA, a scholastic
aptitude test, and a grade-point average,
were available for 10,888 students.

Since each school system had its own 
policies concerning the kind of aptitude 
test to be generally administered, either 
percentile scores or IQ equivalents were 
used in the statistical analysis. The scho- 
lastic aptitude tests included the Coopera- 
tive School and College Ability Tests, 
the Differential Aptitude Tests (Form L), 
the Henmon-Nelson Tests of Mental Abil- 
ity, the Iowa Tests of Educational Develop- 
ment, the Lorge-Thorndike Intelligence 
Tests, the Otis Quick-Scoring Mental Ability
Tests (Beta Test, Form CM, and
Gamma Test, Form AM), the Pintner 
General Ability Tests, and the Preliminary 
Scholastic Aptitude Test. 

The criterion of school achievement, 
grade-point average for the fall semester, 
was generally obtained by assigning weights 
of 4, 3, 2, 1, and 0 to grades of A, B, C, D, 
and F, respectively. Only courses in the 
so-called “solids,” that is, mathematics, 
science, social studies, foreign language,
and English, were used in computing grade- 
point averages. 
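
The criterion can be illustrated with a short sketch. It assumes an unweighted mean over the solid courses; the text does not say whether courses were weighted by credit, and the function and subject labels are hypothetical.

```python
from typing import Dict

GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
# The so-called "solids" named in the text.
SOLIDS = {"mathematics", "science", "social studies",
          "foreign language", "English"}


def grade_point_average(course_grades: Dict[str, str]) -> float:
    """Mean grade points over solid courses only (unweighted: an assumption)."""
    points = [GRADE_POINTS[grade]
              for subject, grade in course_grades.items()
              if subject in SOLIDS]
    if not points:
        raise ValueError("no solid courses on record")
    return sum(points) / len(points)
```

For example, a student with an A in mathematics, a B in English, and an F in a nonsolid course would receive an average of 3.5, since the nonsolid course is ignored.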

The intercorrelations of the seven SSHA
scores, the scholastic aptitude test score,
and the grade-point average were computed
separately for each grade and school
system. The detailed results of this analysis
are given in the Survey of Study Habits
and Attitudes Manual, Forms C and H
(Brown & Holtzman, 1967). Only a summary
of the findings can be presented here.

All 49 of the correlations between the
total SSHA score, Study Orientation, and
grade-point average proved to be highly
significant. The individual validity coefficients
ranged from .32 to .66; the average
correlation based on all 10,888 cases was
.49. Correlations between the scholastic aptitude
test score and grade-point average
were only slightly higher,
with an average of .57. The correlation between
the SSHA and the scholastic aptitude
test was generally low,
with a mean of .27, suggesting that the
SSHA measures important traits related
to school achievement that are untouched
by the standard scholastic aptitude tests.
Very similar findings were obtained among
college students with both Form C and
the original version of the SSHA.


TABLE 1
Mean Correlation of Survey of Study Habits and Attitudes (SSHA),
Scholastic Aptitude (SA), and Grade-Point Average (GPA),
with Multiple (R) and Partial (r) Correlations of Scores
with Grade-Point Average

                   SSHA       SSHA       SA                            Study Orientation
Grade      N       with GPA   with SA    with GPA    R     Partial r     M        SD
7         1,684      .55        .32        .61      .72      .47       106.5     33.1
8         1,628      .52        .29        .59      .69      .45       107.0     32.8
9         2,005      .49        .29        .52      .63      .41       101.6     31.5
10        2,064      .49        .29        .62      .70      .41        97.9     31.6
11        1,840      .47        .24        .57      .67      .42        97.7     30.9
12        1,667      .46        .20        .53      .66      .43       102.9     30.3
Total    10,888      .49        .27        .57      .67      .43

Note. Mean correlations were obtained by converting each r into its Fisher’s z function, weighting
by the appropriate number of cases, averaging the values, then reconverting. Multiple and partial
coefficients were derived from the weighted averages.

Further insight into the nature of the
relationship between the study habits and
attitudes, scholastic aptitude, and school
achievement can be gained by inspection of
the multiple and partial correlations using
grade-point average as a criterion.
These statistics, together with the mean
zero-order correlations among the three
variables, are presented separately for each
grade in Table 1. It is apparent in every
case that both the SSHA and the scholastic
aptitude tests contribute in significant and
distinctly different ways to the successful
prediction of actual school achievement.
The multiple correlation using both predictors
is appreciably higher than the
correlation of either one alone with the
criterion. Because of the low intercorrelation
between the SSHA and scholastic
aptitude, the correlation between the SSHA
and grade-point average with scholastic
aptitude partialed out is still quite high,
ranging from .41 to .47 across the six grade levels.
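
For two predictors, the multiple and partial coefficients follow directly from the three zero-order correlations. The sketch below uses the standard textbook formulas; the function names are hypothetical, and it is assumed that the published coefficients were obtained in essentially this way from the weighted average correlations.

```python
import math


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion y with two predictors."""
    r_squared = ((r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12)
                 / (1 - r_12 ** 2))
    return math.sqrt(r_squared)


def partial_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Correlation of y with predictor 1, predictor 2 partialed out."""
    return ((r_y1 - r_y2 * r_12)
            / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2)))


# Overall averages reported in the text: SSHA with grades, aptitude with
# grades, and SSHA with aptitude.
R = multiple_r(0.49, 0.57, 0.27)
r_partial = partial_r(0.49, 0.57, 0.27)
```

With the overall averages of .49, .57, and .27, these formulas give a multiple correlation of about .67 and a partial correlation of about .42, in line with the reported values; a difference of .01 in the partial coefficient is of the size expected when two-decimal inputs are used.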
Table 1 also contains the mean and standard
deviation of the Study Orientation
score for each of the school grades. The
slight but regular drop in variance with increasing
grade level, coupled with a similar
drop in the correlation between SSHA and
grade-point average, is probably due to
gradual loss of extremely poor students
who drop out of high school before finishing.
Reasons for the minor fluctuation in
mean scores, however, are unknown.

Unlike the original version of the SSHA, 
both Form C and Form H consist of four 
basic subscales, each containing 25 items 
clustered together in a scale because of 
commonly shared content. It is of some 
interest to examine the intercorrelations of 
these four scales as well as their relative 
value in predicting the criterion of school
achievement.


TABLE 2
Average Intercorrelation Coefficients, Means, and Standard
Deviations of Basic Scores on the Survey of Study Habits
and Attitudes, Form H

Scale                          DA      WM      TA      EA
Delay Avoidance (DA)                   .70     .51     .65
Work Methods (WM)                              .56     .65
Teacher Approval (TA)                                  .75
Education Acceptance (EA)
M                             21.9    22.8    28.5    27.5
SD                             9.7     9.2    10.1     8.7

Note. N = 11,218.


Table 2 contains the mean
intercorrelations of the four basic scales.
These correlation coefficients were obtained 
by converting each entry in a table of cor- 
relations for each of the grades at each par- 
ticipating school system to its Fisher’s 
z function, weighting each by its appro- 
priate number of cases, averaging the val- 
ues, and reconverting. The values as pre- 
sented are based on all 11,218 cases. 
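
The averaging procedure can be sketched in a few lines. The sketch weights each z value by its number of cases, as the text states (some treatments weight by n - 3 instead); the function name and the example values are hypothetical.

```python
import math
from typing import Sequence


def mean_correlation(rs: Sequence[float], ns: Sequence[int]) -> float:
    """Case-weighted average of correlations through Fisher's z."""
    if len(rs) != len(ns) or not rs:
        raise ValueError("need one sample size per correlation")
    zs = [math.atanh(r) for r in rs]            # r to z
    mean_z = sum(z * n for z, n in zip(zs, ns)) / sum(ns)
    return math.tanh(mean_z)                    # z back to r


# Hypothetical correlations from three school systems of unequal size.
average_r = mean_correlation([0.40, 0.55, 0.62], [300, 1200, 500])
```

Because the transformation is monotonic, the average always falls between the smallest and largest of the correlations entered, with the larger samples pulling it toward their own values.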

Intercorrelations among the four sub- 
scales are moderately high and positive, 
suggesting that one major dimension run- 
ning through all four scales is sufficient to 
account for most of the variance. The 
Study Orientation score obtained by sum- 
ming the raw scores on the four basic scales 
is the best single measure of this dimen- 
sion. It should be noted, however, that the 
highest correlations, .70 and .75, occur 
between the pairs of scales making up the 
derived scales of Study Habits and Study 
Attitudes, respectively. Given the large 
sample size and resulting high stability of 
the obtained correlations, it can be con- 
cluded that Delay Avoidance and Work 
Methods have more in common with each 
other than either has with the remaining 
two scales. The same can be said for 
Teacher Approval and Education Accept- 
ance. This finding provides some empirical 
justification for the derived scales based
on these two pairs of subscales.

TABLE 3
Survey of Study Habits and Attitudes (Form H) Subscale
Correlations with Grade-Point Average (GPA) and
Scholastic Aptitude (SA)

Scale                     GPA(a)    SA(b)
Delay Avoidance            .41       .16
Work Methods               .47       .37
Teacher Approval           .35       .29
Education Acceptance       .48       .26

Note. Subscale correlations were obtained by
converting each r into its Fisher’s z function,
weighting by the appropriate number of cases,
averaging the values, then reconverting.
(a) N = 10,888.
(b) N = 7,157.


Correlations between each of the four
basic subscales, grade-point average, and
the scholastic aptitude measures are given
in Table 3. Delay Avoidance shows the
least amount of overlap with scholastic
aptitude (r = .16) while Work Methods
shows the most (r = .37). Validity coefficients
for the four scales are only slightly
lower than the comparable coefficient for
the total SSHA score, Study Orientation.
The values of .47 and .48 for Work Methods
and Education Acceptance are almost
identical with the correlation of .49 for
Study Orientation.


Discussion 

The above results clearly demonstrate 
the reliability and validity of the SSHA, 
Form H, when extended downward as far 
as the seventh grade. Correlations between 
the high school version of the SSHA and 
subsequent academic grades are even higher 
than those generally obtained for the col- 
lege edition, Form C. The average correla- 
tion between SSHA, Form C, and grades 
for freshmen in six different colleges was 
.36 (Brown & Holtzman, 1967), as com- 
pared to .49 for SSHA, Form H, and high 
school students. Undoubtedly, much of this 
difference is due to the greater heterogeneity 
of the high school students. The mean score 
on Study Orientation for Form H is 100.3,
as contrasted to a mean of 114.2 for Form
C, indicating that the greater heterogeneity
is due largely to a higher proportion of
low scores among high school students than
among college freshmen.

Granted that the SSHA is a useful instrument
for evaluating the study habits and
motivation of students in Grades 7-12 as
well as in college, to what extent can the
habits and attitudes of such students be
improved by special training or counseling?
A recent experiment by Haslam and
Brown (1968) indicates that substantial
improvement can be obtained from appropriate
study-skills instruction, improvement
not only in SSHA scores but also
in subsequent academic grades, as compared
to matched control cases. Clearly,
the junior or senior high school offers an
even more meaningful opportunity for systematic
efforts aimed at improving study
habits, attitudes, and motivation than does
the freshman year in college.

In spite of the substantial overlap in
meaning among scores from the four basic
SSHA subscales, the use of subscales has
real value in individual counseling. Because
it is hard to get across the content of
the survey in a simple dramatic manner
using the original SSHA single score, the
use of subscale scores rather than individual
items makes it possible for a counselor to
stress four somewhat different areas rather
than a few individual items. In this way
the student can profit most from such
counseling by being able to remember where
his difficulties lie. In addition, individuals
who wish to engage in evaluative research
may more profitably use the four relatively
homogeneous subscale scores rather than
either single items or only one general
score.


EFFECTS OF INCIDENTAL CUES AND ENCODING
STRATEGIES ON PAIRED-ASSOCIATE LEARNING


University of Delaware 


Several studies have demonstrated that picture stimuli relative to 
word stimuli facilitate paired-associate learning. To determine 
whether incidental cues in pictures, or possibly a different encoding 
strategy elicited by pictures, produce this learning difference, 6 treatments,
involving 72 Ss, were devised. When incidental cues in pictures
were varied by presenting Ss with several pictures representing the
same concept, pictures unexpectedly produced faster learning than
words (p < .001). Adding incidental cues to word stimuli did not increase
learning efficiency relative to normally presented words. Moreover,
requiring Ss to label the stimuli, pictures or words, failed to produce
a significant difference in performance from that of Ss who
learned under standard conditions. Finally, superiority of picture
stimuli over word stimuli was replicated (p < .01).


Several investigators (Deno, 1968; Lumsdaine,
1949; Paivio & Yarmey, 1966) have
demonstrated that picture stimuli, com- 
pared to word stimuli, facilitate paired- 
associate (PA) learning. Others (Jenkins, 
Neale, & Deno, 1967) who measured recog- 
nition memory for both media found that 
pictures are more easily recognized than 
words. The present study attempts to de- 
termine the characteristics of pictorial 
representation that produce the observed 
differences in learning and memory. 

Two alternative explanations are pro- 
posed to account for the effectiveness of 
pictorial stimuli in learning. An incidental 
cue explanation postulates a difference in 
the degree of perceptual richness for words 
and pictures. In the typical PA experi- 
ment, words and pictures are chosen so 
that the picture and word unequivocally 
represent the same concept. Generally, 
this is accomplished by choosing pictures 
that are given a common verbal label by 


*This research was supported its 
University of Minnesota, tee “deli 
Human Learning, from the National Science 
Foundation (08541), the National Institute of 
Child Health and Human Development (5-P01- 
HD-01136-03), and the Graduate School of the 
University of Minnesota. The data on which this 
paper is based were included in the author's dis- 
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nearly all Ss. This procedure results in i 
set of pictures that are presumably equiva 
lent to a corresponding set of words in the 
extent to which they elicit some common} 
verbal response. An obvious fact, howevet, 
is that the stimuli, pictures and word, 
contain cues that are additional or iér 
dental in the sense that these cues may 
be altered without changing the tendenty 
of Ss to name the stimulus appropriately. 
A cue is therefore incidental if changin: 
it does not reduce the probability that & 
will give the intended label. For example 
incidental cues in the picture stimulus | 
BOY, may be distinctive clothing, a fat 
expression, or books the boy is cartymé 
On the other hand, incidental cues, ta 
though present in the word BOY, may 
less conspicuous, such as pica type 
upper-case letters, and context. Moreovel 
the fact that other word stimuli may shat 
a number of these same attributes increases | 
the similarity among the stimuli. a 
It may be, then, that incidental oF 
produce the picture-word difference | 
learning by presenting additional i 
(functional stimuli) to which lea 
can attach the to-be-learned respons® 7 
making the stimuli physically a 
similar and thus more discriminable, wi 
changing the stimulus on an abstra 
dimension. : 5 
Another explanation investigated ng 
study involves a difference wo a 
processes for the two types 0 
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Learners may have a weaker tendency 
to represent or encode a stimulus verbally 
when it is a picture rather than a word 
(Lumsdaine, 1949). Bousfield (1961) con- 
tended that associative interference oc- 
curs in learning as a result of similarity 
among representational responses. Deno 
(1968) recently reported that the largest 
picture-word differences in learning occur 
with conceptually similar stimuli, that is, 
stimuli which presumably elicit associ- 
atively similar representational responses. 
Deno’s study suggests that by representing
the stimulus picture nonverbally, the
learner reduces associative interference
that he would otherwise experience had he 
represented the stimulus verbally. 

The present study seeks to assess the 
contribution of incidental cues to efficiency 
in PA learning by maximizing or mini- 
mizing their presence and noting the effect 
on learning. This was attempted by pre- 
senting Ss with several different pictures 
representing the same concept (washing 
out incidental cues) or by presenting word
stimuli to which incidental cues, such as
color, size, and unique lettering, were
added (enriching words). In addition, the
study was designed to bear upon the proposition
that picture-word learning differences
are produced by different encoding
strategies. It was hypothesized that increasing
the similarity of the strategies
used to encode words and pictures increases
similarity in performance. This
was tested by noting if the requirement
to label all stimuli overtly produced a
decrement in performance, particularly
when the stimuli were pictures.


Method
Subjects


Subjects for this experiment were 72 students
enrolled in introductory psychology classes at the
University of Minnesota. Each student’s participation
was voluntary but was also rewarded with
class credit.

Several restrictions were placed on the selection
of Ss. In an effort to reduce the variability
among Ss, only female Ss were allowed to participate.
Secondly, to insure that all Ss possessed approximately
the same language habits, only those
who spoke English natively were allowed to
participate. Finally, since Japanese words were
employed as responses, no S could participate who
had previously studied an oriental language.
Twelve Ss were assigned randomly to each of six
experimental groups.


Materials 


The Ss learned a PA list consisting of 12 stimu- 
lus-response pairs. The stimuli were either words, 
words with incidental cues added, or pictures repre- 
senting 12 common objects. Responses were 12 
Japanese words. 

The 12 stimuli were conceptually similar, that 
is, there were three instances from each of four 
conceptual categories—animal, clothing, furniture, 
and people. Each category instance was maximally
separated from other instances in the same cate- 
gory. Such a list was chosen on the basis of Deno’s 
(1968) finding that conceptually similar, maxi- 
mally separated stimuli produce the greatest 
picture-word difference in learning, since it is this 
difference that the current study seeks to investi- 
gate. The following stimulus concepts and re- 
sponses were used. 


Stimulus        Response
TABLE           ATSUI
MAN             HUNE
HAT             RIKO
DOG             KARAI
CHAIR           BAKA
BOY             AMAI
COAT            HIKUI
CAT             HAYAI
[illegible]     [illegible]
GIRL            [illegible]
TIE             OCHIKAL
MOUSE           TOOI

The same 12 Japanese words were used as responses
regardless of the experimental condition
to allow for controlled comparisons among conditions.
In addition, two random compositions of
stimuli and responses were used so that one-half
of the Ss in each experimental condition received
a different random order. In the two random pairings
stimuli were never paired with the same response.
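
One way to construct two such pairings is sketched below. The report does not say how the pairings were actually generated; the sketch simply reshuffles the responses until no stimulus keeps the response it had in the first pairing. The function name and the placeholder labels S1-S12 and R1-R12 are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def two_disjoint_pairings(
    stimuli: List[str], responses: List[str], rng: random.Random
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Two random pairings in which no stimulus has the same response twice."""
    if len(stimuli) != len(responses) or len(stimuli) < 2:
        raise ValueError("need equally long lists of at least two items")
    first = list(responses)
    rng.shuffle(first)
    while True:
        second = list(responses)
        rng.shuffle(second)
        # Accept only if every position differs from the first pairing.
        if all(a != b for a, b in zip(first, second)):
            break
    return dict(zip(stimuli, first)), dict(zip(stimuli, second))


stimuli = [f"S{i}" for i in range(1, 13)]
responses = [f"R{i}" for i in range(1, 13)]
pairing_a, pairing_b = two_disjoint_pairings(stimuli, responses, random.Random(0))
```

With 12 distinct responses, roughly one reshuffle in three satisfies the constraint, so the loop ends after a few attempts. Half of the Ss in each condition would then learn one pairing and half the other.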

Three random orders of stimuli were obtained
to guard against sequential response learning.
This was accomplished, first, by ordering the concepts
within a list and, second, by assigning the instances
within each category. For a given S the
same stimulus concept was always paired with the
same response term.


Apparatus

The picture stimuli and responses were photographed
on 35mm. black-and-white film, and negatives
were slide-mounted. A typical 2 second-2
second PA anticipation method was employed.


Procedure

All Ss participated individually. Each S, upon
arrival at the laboratory, was randomly assigned 
to one of the following six experimental conditions. 

Word-Normal (WN). The Ss learned verbal
responses to word stimuli in normal orthography
(typewriter capitals). A sample list of this condition
is given above.

Picture-Normal (PN). The Ss learned verbal 
responses to pictorial stimuli. The same picture 
was always used as a particular stimulus. 

Word-Enriched (WE). The Ss were shown
“perceptually enriched” words, that is, the stimulus
words were reproduced with distinctive lettering,
colors, and slants. Each stimulus was enhanced by
the addition of incidental cues.

Picture-Washout (PWa). The Ss learned responses
to picture stimuli, but the same picture
never occurred on two successive trials. For example,
if on the first trial a picture of a boy was
paired with the Japanese word KARAI, on the
second, third, fourth, and fifth trials a different
picture of a boy was paired with the Japanese
word KARAI. Each of the 12 stimuli was represented
by five different pictures. All the pictures
were highly labelable.

Word-Label (WL). The Ss were required to 
label the stimulus word overtly when it appeared. 
Word stimuli were in normal form. 

Picture-Label (PL). The Ss were instructed
to label each picture overtly as it appeared. Picture
stimuli were in normal form.

After E described the learning task to each S, 
he proceeded to give one familiarization trial with 
the stimuli and three familiarization trials with 
the responses. The familiarization procedure was 
intended to reduce variability in performance 
that is ordinarily attributed to response learning. 
Since different groups required longer to learn, the 
sum of correct responses over the first seven trials, 
rather than the total number of correct responses, 
was used to measure group performance. 
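
The measure can be expressed as a one-line computation. The sketch is illustrative only; the function name and the trial-by-trial counts in the example are hypothetical.

```python
from typing import Sequence


def performance_score(correct_per_trial: Sequence[int], n_trials: int = 7) -> int:
    """Sum of correct responses over the first n_trials trials only."""
    if len(correct_per_trial) < n_trials:
        raise ValueError("fewer trials recorded than the scoring window")
    # Trials beyond the window are ignored so that Ss who needed more
    # trials are scored on the same basis as faster learners.
    return sum(correct_per_trial[:n_trials])


# Hypothetical S who was run for nine trials on the 12-pair list.
score = performance_score([3, 5, 6, 8, 9, 10, 11, 12, 12])
```

With a 12-pair list the score can range from 0 to 84, whatever the number of trials an S eventually required.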


Results

The performance and tests of signifi- 
cance are presented in Table 1. 

The dependent variable, the mean number
of correct responses, was employed in
the five a priori comparisons that follow. To test
the incidental cue hypothesis, performance 
of Group WN was tested, first, against 
PWa and, second, against WE. 

The results of the first comparison fail
to support the incidental cue hypothesis.
Not only did washing out incidental cues
fail to produce comparable performance
between Groups PWa and WN (t =
4.53, p < .001), but also, conversely,
Group PWa seemed to have learned as
easily as Group PN.


TABLE 1
Means and Standard Deviations of the Numbers
of Correct Responses and Performance
Comparisons for the Treatment Groups

Group                           M        SD
Picture stimuli
  Picture-Washout (PWa)       51.0      9.75
  Picture-Normal (PN)         43.83     7.69
  Picture-Label (PL)          38.42    15.58
Word stimuli
  Word-Enriched (WE)          34.25    15.18
  Word-Label (WL)             34.17    10.65
  Word-Normal (WN)            25.92    16.54

Comparison                    t ratio
PWa versus WN                  4.53**
WE versus WN                   1.29 ns
PN + PL versus WN + WL         2.92*
PL + WL versus PN + WN         0.38 ns
PN - WN versus PL - WL         1.80 ns

*p < .01.
**p < .001.


The second comparison, similarly, fails
to support the incidental cue hypothesis, 
in that enriching words did not signifi- 
cantly facilitate learning relative to nor- 
mally written words (t = 1.29, ns). 

The third comparison of the rate of
learning² with picture stimuli as opposed to
learning with word stimuli supports the
common finding that picture stimuli facilitate
learning relative to word stimuli (t =
2.92, p < .01).

The fourth comparison tests the effect on
learning of requiring Ss to label both
picture and word stimuli. Results suggest
that Ss who label the stimuli overtly do
not perform with significantly different
success (t = .38, ns) than Ss who are not
required to label the stimuli.

Finally, the interaction of stimulus mode
with a requirement to label, the fifth comparison,
is not statistically significant
(t = 1.80, ns).
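
The five t ratios can be recovered, to within rounding, from the group means and standard deviations alone. The sketch below rests on two assumptions that are not stated in the report: that the error variance for each comparison was pooled over only the groups entering that comparison, and that the Word-Enriched mean is 34.25, the value consistent with the reported t of 1.29. The function name is hypothetical.

```python
import math
from typing import Dict

N_PER_GROUP = 12
# Group means and standard deviations (WE mean taken as 34.25).
GROUPS = {
    "PWa": (51.00, 9.75),
    "PN": (43.83, 7.69),
    "PL": (38.42, 15.58),
    "WE": (34.25, 15.18),
    "WL": (34.17, 10.65),
    "WN": (25.92, 16.54),
}


def contrast_t(weights: Dict[str, int], n: int = N_PER_GROUP) -> float:
    """t ratio for a contrast among group means with equal group sizes."""
    estimate = sum(w * GROUPS[g][0] for g, w in weights.items())
    # Assumption: pool the variances of only the groups in the contrast.
    pooled_var = sum(GROUPS[g][1] ** 2 for g in weights) / len(weights)
    std_error = math.sqrt(pooled_var * sum(w * w for w in weights.values()) / n)
    return estimate / std_error


t_first = contrast_t({"PWa": 1, "WN": -1})
t_second = contrast_t({"WE": 1, "WN": -1})
t_third = contrast_t({"PN": 1, "PL": 1, "WN": -1, "WL": -1})
t_fourth = contrast_t({"PL": 1, "WL": 1, "PN": -1, "WN": -1})
t_fifth = contrast_t({"PN": 1, "WN": -1, "PL": -1, "WL": 1})
```

Under these assumptions each computed ratio falls within .01 of the reported value. The last three contrasts use the same four groups and correspond to the two main effects and the interaction of a 2 X 2 design.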


Discussion 


The four conditions involved in the test
of the incidental cue hypothesis are PN,

² Note that the third, fourth, and fifth comparisons
constitute the treatment and interaction effects
of a normal 2 X 2 analysis of variance.
WN, PWa, and WE. The hypothesis states
that differences in perceptual richness be- 
tween words and pictures produce the ob- 
served differences in learning. Therefore, 
raising the salience of incidental cues by 
embellishing the words should facilitate 
PA learning relative to normally written 
words. Also, washing out incidental cues 
in pictures by varying the pictures on 
acquisition trials should produce perform- 
ance comparable to that produced by nor- 
mally presented words. Contrary to predic- 
tions made on the basis of the incidental 
cue hypothesis, embellishment of word 
stimuli did not facilitate learning relative 
to normally produced words; and mini- 
mizing incidental cues in picture stimuli 
failed to reduce learning efficiency to a 
level comparable to normally presented 
words. Furthermore, an explanation for 
picture superiority in PA learning ap- 
parently does not involve a difference in 
the linguistic encoding of the stimuli, 
since requiring learners to label the stimuli 
did not significantly affect performance. 

A psychological explanation for the dif- 
ferential effectiveness of word and picture 
stimuli in PA learning is not immediately 
apparent. Perhaps an alternative explana- 
tion along the lines of Kagan’s (1967) 
attentional determinants of learning is 
worthy of investigation. Kagan presented 
evidence to suggest that the attention 
given to and the reinforcement value of 
creating schema (the psychological representation
of an external pattern) is inversely
related to the predictability of
the external pattern. With this consideration,
pictorial representations typically
employed in comparisons of stimulus mode
may very likely be expected to elicit
greater attention from the learner, be- 
cause they fail to correspond with his own 
schematic representation of the concept. 
This would certainly be the case under 
conditions where a single concept is 
represented by a series of different pic- 
tures (as in PWa). On the other hand, a 
single word, regardless of its presentation, 
should not influence the natural schematic 
representation of the concept. 

Clearly, simplistic notions involving in- 
cidental cues and labeling strategies must 
be reexamined. The results of the present 
investigation suggest that a learner pre- 
sented with a picture stimulus adopts as 
his functional stimulus some aspect of the 
stimulus configuration, perhaps a schema, 
the efficacy of which is independent of the 
verbal label for that stimulus and of in- 
cidental cues present in the stimulus. 
CLASSROOM CLIMATE AND INDIVIDUAL LEARNING


HERBERT J. WALBERG AND GARY J. ANDERSON
Harvard University 


The research is one of a series testing the Getzels-Thelen theory of 
the classroom as a social system. From a new measure of student 
perception of classroom climate, 18 subscores were obtained in 
November from 76 classrooms throughout the United States and 
used to predict 9 cognitive, affective, and behavioral measures of
learning at the end of the school year (regression-adjusted for initial 
differences). More than 4 times as many correlations as the chance 
expectancy were significant (p < .05). Among the structural and
affective climate measures, variables grouped under the rubrics “isomorphism,”
“organization,” and “synergism” predicted learning
variables more frequently than “coaction” and “syntality.”


Two recent studies have shown that
scores obtained on a measure of the 
socioemotional climate of the classroom 
(Walberg, 1966) can be predicted from 
earlier measures of (a) teacher personality 
(Walberg, 1968a) and (b) student ability 
and interest in the subject (Walberg & 
Anderson, 1968). Yet this work is incom- 
plete in that it does not demonstrate that 
the student’s individual satisfaction with 
the climate of the class makes for learn- 
ing, the criterion of institutional effective- 
ness espoused by school boards, parents, 
administrators, and teachers. The intent of 
the present research is to investigate this 
crucial relationship and to explore em- 
pirically further hypotheses derived from 
a sociopsychological theory of the class- 
room as a social system (Getzels & Thelen, 
1960). 

Getzels and Thelen make an analytic 
distinction between institutional role ex- 
pectations and individual personality dis- 
positions, which both bear upon the climate 
of the class. The constellation of role
expectations can be termed the “structural” 
dimension (Walberg, 1968b); it refers 
to the structure or organization of stu- 


* This research is part of the evaluation of 
Harvard Project Physics, a course-development 
project supported by the Carnegie Corporation of 
New York, the National Science Foundation, the 
Sloan Foundation, and the United States Office of 
Education. The authors thank Fletcher G. Watson 
and Wayne W. Welch for comments on a draft of 
the manuscript and Mary Hyde and Arthur Roth- 
man for computer consultation and special pro- 
graming. The second author is now at McGill 
University. 


dent roles within the class, for example, 
such things as goal direction and demo- 
cratic policy. The structural dimension 
applies to shared, group-sanctioned class- 
room behavior, while the “affective” di- 
mension pertains to idiosyncratic personal 
dispositions to act in a given way to satisfy 
individual personality needs. Aspects of 
the affective dimension are such things 
as satisfaction, intimacy, and friction in 
the class. A recent multivariate study, in 
the same theoretical vein, of 72 classrooms 
showed that student perceptions of the 
structural and affective aspects of socioemotional
climate are strongly related
(canonical correlations as high as .8).
And although the patterns of correlation 
are complex, they are interpretable in 
terms of the Getzels-Thelen conceptual 
scheme and certain other sociopsychologi- 
cal theories (Walberg, 1968b). The present 
study fits into the series as follows: 

Input                    Throughput               Output
September pretests       December midtest         May posttests
[...] and cognitive
measures on              Classroom climate        Student learning

Teachers ------------->  Structure                Cognitive
                                    - - - - - ->  Affective
Students ------------->  Affective                Behavioral


The solid lines refer to relationships that
have already been established in earlier
work; the broken lines refer to the object
of this study: the examination of the
hypothesis that individual student achievement
and interest in the subject at the end
of the school year can be predicted from
structural and affective aspects of classroom
climate measured at midyear.

Previous studies were of the class as a 
whole and, hence, the units of analysis were 
the means of the student measures within 
each class. It has been shown that the cor- 
relation of means of subgroups and the 
correlation of individuals within the same 
sample can differ in sign and magnitude 
(Robinson, 1950). Hence, a parallel means 
analysis is in progress to complete the 
series above, but the focus of this study is 
the individual. It seeks to determine the 
learning of individuals with different per- 
ceptions of classroom climate rather than 
the mean perception of entire classes. 


Method


Subjects and Instruments 


Some 2100 high school juniors and seniors in 76
classes throughout the country participated in the
preliminary evaluation of Harvard Project Physics,
an experimental course using a variety of new instructional
media and emphasizing the philosophical,
historical, and humanistic aspects of
physics. The mean Henmon-Nelson IQ of a random
sample of the group is 115. Their scores on
five instruments constitute the data for analysis. 

The battery of cognitive, affective, and be- 
havioral criterion measures includes the Physics 
Achievement Test, the Science Process Inventory, 
the Semantic Differential for Science Students, and 
the Pupil Activity Inventory. The Physics Achievement
Test (Ahlgren, Walberg, & Welch, unpublished,
1966) is a 36-item multiple-choice test designed
to measure general knowledge of physics.
It has a Kuder-Richardson formula 20 reliability
(Guilford, 1954) of .76 based on a random sample
of 400 high school students at the end of their
physics course. The Science Process Inventory
(Welch & Pella, 1967) consists of 100 true-false
statements describing the assumptions, activities,
products, and ethics of science. The test was validated
on a sample of eminent scientists and has a
Kuder-Richardson formula 20 reliability of .86.

The Semantic Differential is familiar to many
researchers and has been described elsewhere (Osgood,
Suci, & Tannenbaum, 1957; Walberg &
Anderson, 1968). Six clusters reflecting affective
objectives of Harvard Project Physics were selected
for analysis. Using the Spearman-Brown
formula to correct the mean item-intercorrelations
for the number of items (Guilford, 1954) yields
reliabilities of about .8. (See Table 1 for reliabilities
of all scales and tests discussed here.)
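
The Spearman-Brown step described above can be sketched in a few lines. This is only an illustration of the standard formula, not the authors' computation: the function name is hypothetical, and the example values (a mean inter-item correlation of .50 across 4 items) are assumed purely to show how a reliability near .8 can arise.

```python
def spearman_brown(mean_r, n_items):
    """Reliability of an n-item composite from the mean inter-item correlation."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# Hypothetical values, not taken from the study.
example_reliability = spearman_brown(0.50, 4)
print(round(example_reliability, 2))
```

With `n_items=2` the same function gives the familiar split-half correction, which is presumably the form behind the "corrected split-half" reliabilities reported for the climate scales.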

The Pupil Activity Inventory was described by
Cooley and Reed (1961). It consists of a number
of adolescent science activities, and the student is
asked to indicate the frequency of his participation
in each. Walberg (1967) re-factor analyzed
the instrument for the present sample and found
five dimensions: Academic, Biological, Tinkering,
Cosmology, and Applied Life. The Academic,
Tinkering, and Cosmology cluster scores were
summed for a physics activity score which yields a
Spearman-Brown corrected internal consistency
of .76.

The first form of the Classroom Climate Ques- 
tionnaire (Walberg, 1966) consists of 80 items 
describing characteristics of school classes, for 
example, “The class members are working toward 
many different goals.” The respondent expresses 
agreement or disagreement with each on a 4-point 
scale. The instrument yields 18 factor-analytically 
derived cluster scores which, for individuals, range 
in corrected split-half reliability from .41 to .86
(see Table 1 and Walberg & Anderson, 1968). A
revised instrument with more items and, hopefully, 
greater reliability is being used to replicate the 
work this year. 


Procedure 

The data were obtained using a randomized 
data-collection system within each class which 
tends to minimize individual testing time and 
maximize the number of tests which can be ad- 
ministered (Walberg & Welch, 1967). The system 
is most appropriate for class means analysis, but 
it does provide patterns of scores for studies of 
individual students as well, with certain restric- 
tions. Random halves of the group of students 
took the criterion measures at the beginning and 
at the end of the year; a random fourth took the 
Classroom Climate Questionnaire at midyear. The
sampling fraction for any combination of tests is
the product of the sampling fractions for the tests
in the combination. Thus, for the midtest and any
posttest, a fourth times a half, or an eighth, took both
measures. To bring pretests into the analysis, the
eighth must be multiplied by a half, giving
one-sixteenth. Thus, for a total of 1700 students who
finished the course, about 214 took the midtest and 
a given posttest; and 106 took the same pretest 
and posttest, as well as midtest. Actually, be- 
cause of absentees and unusable answer sheets, the 
figure is about 85 for any given combination of 
pre-, mid-, and posttest. 
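
The sampling-fraction arithmetic in this paragraph can be checked with a short sketch. It assumes, as the text does, that the random assignments to pretest, midtest, and posttest are independent; the constant and function names are hypothetical.

```python
from fractions import Fraction


def combined_fraction(*fractions):
    """Product of independent sampling fractions."""
    result = Fraction(1)
    for fraction in fractions:
        result *= fraction
    return result


PRETEST = Fraction(1, 2)    # random half took a given criterion pretest
POSTTEST = Fraction(1, 2)   # random half took a given criterion posttest
MIDTEST = Fraction(1, 4)    # random fourth took the climate questionnaire
FINISHERS = 1700            # students who finished the course

mid_post = combined_fraction(MIDTEST, POSTTEST)
pre_mid_post = combined_fraction(PRETEST, MIDTEST, POSTTEST)

expected_mid_post = float(mid_post * FINISHERS)
expected_pre_mid_post = float(pre_mid_post * FINISHERS)
print(mid_post, expected_mid_post)
print(pre_mid_post, expected_pre_mid_post)
```

The expected counts land close to the figures of about 214 and 106 quoted above; the usable figure of about 85 reflects absentees and unusable answer sheets, which the sketch does not model.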

From the group of 25 subscores on the tests 
given at the beginning and at the end of the 
course, 9 were selected as criteria for measuring 
student learning, since they measure cognitive, 
affective, and behavioral course objectives. They 
are: physics achievement; science understanding;
six semantic differential measures; and physics 
activities, which is the sum of the Academic 
Science, Cosmology, and Tinkering scales on the 
Pupil Activity Inventory. The reliabilities of the 
scales are shown in Table 1. Using a method de- 
scribed by Ferguson (1959), regression-adjusted 
gain (“delta”) scores (the posttests’ standardized
deviations from predicted scores based on the


TABLE 1
Correlations of Classroom-Climate and Student-Learning Measures
Cognitive Affective® Behavioral 
i . Laboratory Universe Physics s 
Classroom climate Physics ae Physics 
Pgh standing? Fun [Beauti-leriendiy| Inte nu Impor activities 
a | 6) | a) | om | 6) | oy [ex] @ | ao 
Structural aspects 
Coaction 
Subservient (57) Pip 
Strict control (61) _ 
Speech constraint (41) 
Isomorphism i ‘ ihe 3 
Democratic (80) 26 28 22) 
Stratified ah —22* | —34*| —22* — 25" —24* —27* 
Egalitarian (67) 20 32*) 40’ 
Organization 
Goal direction (80) 41*| 25% 26* eo 24* 
Disorganized (55) —21 A = 
Formality (51) 23 23) 
Goal diversity (64) —19* 
Affective aspects 
Syntality 
Classroom intimacy (79) 6 
Alienation (75) 
Group status (68) 
Synergism 
Satisfaction (53) 30*| 24%) 18 y a 
Friction (86) —31* | —24* ~23*| —27 Hi fe 
Personal intimacy (58) 20* 20 
Miscellaneous 
Social heterogeneity (79) | —23* —27* 5 
Interest heterogeneity 21 —23* - 
(61) 


Note.—Decimals and correlations below the .10 significance level are omitted. Test reliabilities are given
in parentheses, decimals omitted.
a n = 96.
b n = 76.
c n = 82.
d n = 82.
*p < .05.


pretest) were calculated for each of the criteria.
These scores represent the student’s learning on 
each criterion during the course adjusted for initial 
status. The adjusted criteria were correlated with 
each of the 18 measures of classroom climate. 
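
A regression-adjusted gain score of the kind described here can be sketched as the standardized residual of the posttest after regressing it on the pretest. The sketch below is an assumption about the general technique, not a reproduction of Ferguson's (1959) procedure; all function names and the small data set are hypothetical and serve only to make the example runnable.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def regression_adjusted_gains(pre, post):
    """Standardized deviations of posttest scores from pretest-predicted scores."""
    if len(pre) != len(post) or len(pre) < 3:
        raise ValueError("need matched pre and post scores for at least 3 cases")
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(pre, post)]
    sd = pstdev(residuals)  # population SD; a sample SD would be an equally valid choice
    return [r / sd for r in residuals]


# Hypothetical scores for five students, not data from the study.
pre = [10, 12, 15, 18, 20]
post = [14, 15, 21, 22, 28]
climate = [2, 1, 4, 2, 4]

delta = regression_adjusted_gains(pre, post)
r_climate_delta = pearson_r(climate, delta)
print(round(r_climate_delta, 2))
```

By construction the delta scores are uncorrelated with the pretest, which is the sense in which learning is "adjusted for initial status" before being correlated with a climate measure.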


Results


Table 1 contains 32 statistically significant
correlations (p < .05) between measured
perceptions of classroom climate and
the adjusted learning variables. This
amounts to four times the chance expectancy
in a 9 × 18 matrix of 162 elements.²
The estimates of association are
conservative since the criterion-test scores
are not highly reliable, and adjusted gain
scores are even less reliable. Using a
conservative attenuation correction for
unreliability of the criterion test only, the
correlations rise from 7% to 28%. The
correlations rise from 16% to 200% when
corrected for criterion and predictor
unreliability. Since the question of reliability
of gain scores is still unsettled (see
Harris, 1963), this further correction for a
third source of error variance is not considered
here. In any case, uncorrected correlations
and scale reliabilities are shown
in Table 1. The interested reader is referred
to Guilford (1954) for attenuation-correction
formulas.

² Stepwise multiple correlations were calculated
with significant results accounting for [illegible]
40% of the uncorrected variance in the learning
criteria with three predictors. However, because
of the small number of cases, the [illegible] the
stepwise procedure without cross-validation,
and great number of beta weights, the results are
not reported here.
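
The two calculations in this section, the chance expectancy and the correction for attenuation, can be sketched as follows. The attenuation formula is the classical one (the observed correlation divided by the square root of the product of the reliabilities); the function names and the observed correlation of .30 are hypothetical, while the reliability of .76 is the value reported above for the Physics Achievement Test.

```python
from math import sqrt


def expected_chance_count(n_tests, alpha):
    """Number of significant results expected by chance alone."""
    return n_tests * alpha


def correct_for_attenuation(r, criterion_rel, predictor_rel=1.0):
    """Classical correction of a correlation for unreliability."""
    if not (0 < criterion_rel <= 1 and 0 < predictor_rel <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r / sqrt(criterion_rel * predictor_rel)


def percent_rise(r, corrected):
    """Relative increase of the corrected over the observed correlation."""
    return 100 * (corrected - r) / r


chance = expected_chance_count(9 * 18, 0.05)   # 162 correlations tested at .05
ratio = 32 / chance                            # observed vs. chance-expected count

r_observed = 0.30                              # hypothetical correlation
r_corrected = correct_for_attenuation(r_observed, criterion_rel=0.76)
rise = percent_rise(r_observed, r_corrected)
print(round(chance, 1), round(ratio, 2), round(rise, 1))
```

Because the percentage rise depends only on the reliabilities and not on the size of the observed correlation, a criterion reliability of .76 alone gives a rise of roughly 15%, inside the 7% to 28% range stated in the text.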


Discussion 


While the study is exploratory and em- 
ploys a preliminary form of the instru- 
ment, the results are statistically signifi- 
cant and meaningful enough to warrant 
interpretation. One way to do this is to 
characterize the perception of classroom 
climate for students who made greatest
gains on the different criteria by examining
the columns of correlations in Table 1.
Students who gained the most on the
Physics Achievement Test, for example,
perceived their classes as socially homogeneous,
intimate groups working on one
goal; one might speculate that the goal is
high achievement on physics tests. On the
other hand, students who grew more in
science understanding saw their classes as
well organized with little friction between
their fellow students, and although the
class is seen as egalitarian and unstratified,
the students had a greater variety of
interests. Thus, different perceptions of
classroom climates are associated with
different kinds of cognitive growth:
achievement and science understanding.
Perceptions of climate also predict the
affective growth the course is intended to
bring about. The correlates of only one of
the two ratings for each of the three
concepts reported in Table 1 are discussed
here. Students who reported greater enjoyment
of laboratory work perceived
their classes as unstratified, democratic in
policy setting, having a clear idea of class
goals, and satisfying. Students who gained
the most interest in physics saw their
classes as well organized and unstratified.
Those who rated the concept Universe 
more friendly saw their classes as having 
clear goals, democratic in policy setting, 
egalitarian, unstratified, and as having less 
internal friction and speech constraint. 
Finally, students who reported engaging 
in more physics activities, because they 
were interested, felt more personally intimate
with their fellow class members, less
alienated, and less strictly controlled. 
Thus, students with various perceptions 
of classroom climate grow in different 
ways during a course. Another way of 
examining the results is to analyze the 
correlations across the rows to determine 
which climate variables correlate most 
often with student learning variables. The 
structural climate variables can be divided 
into three subgroups: those having to do 
with “coaction,” “isomorphism,” and “organization.”³
An enormous amount of research
has investigated “teacher-centered”
versus “student-centered” classrooms or 
other variations on the themes of “authori- 
tarian” and “dominant” teaching methods 
(Gage, 1963). However, most of the stud- 
ies, whether they employ tabulations of 
systematic observations or observer rat- 
ings, fail to significantly account for 
variance in student learning. Three “co- 
action” climate variables—subservient, 
strict control, and speech constraint—seem 
to be related to this dimension; and among 
the three, there is only one correlation 
with student learning.
On the other hand, a more promising 
dimension for predicting learning is “iso- 
morphism,” or the perceived equality of 
class members. Democratic, stratified, and 
egalitarian correlate significantly with 
learning in 11 instances. Stratification cor- 
relates with six learning measures, more 


³ These terms are used as a matter of convenience
in discussing the results. Except in the case
of “syntality,” a term employed for some years
by Cattell (see Bereiter, 1966), the authors refrain
from using or adding new terms to the
copious jargon of psychology. In all other cases,
the authors have used words with dictionary
definitions which are to be understood as operationally
defined and discussed here.

than any other variable. Perhaps like
penal or military institutions, learning can 
be at least partially satisfying and ef- 
fective in dominated, oppressed groups as 
long as everyone is treated equally. It 
may be that when one inmate, rookie, or 
student is unfairly favored or set above 
the other, the energies of the group are 
diverted from the attainment of institu- 
tional or private goals into the resulting 
dissension.

Another group of structural measures 
that predict learning have to do with 
“organization” of the class—goal direction, 
disorganization, and formality. This group 
calls to mind Ryans’ (1960) “Teacher 
Characteristics Pattern Y”—responsible,
businesslike, systematic teacher behavior. 
A previous study (Walberg, 1968a) showed 
that these climate variables can be pre- 
dicted from teacher personality. Among the 
organization measures, there are eight cor- 
relations with learning criteria. 

Affective climate predictors can be 
grouped into “syntality” and “synergism” 
measures. One might derive from political
theory the hypothesis that, like national- 
ism which promoted modern states, “syn- 
tality,” or emotional identification with a 
group cause, enhances learning. Such does 
not appear to be true of the class, however; 
with one exception, the “syntality” meas- 
ures—group status, classroom intimacy, 
and alienation—do not predict the cri- 
teria. Waller’s hypothesis (1932) that stu- 
dents identify with the school through 
competitive extramural sports, pep rallies, 
social clubs, and the like may prove more 
fruitful empirically.

The climate measures of “synergism,” 
the personal (or what some psychologists 
have termed the “psychodynamic” or 
“interpersonal”) relations between class
members, do predict learning. These
variables are personal intimacy, friction,
and satisfaction, and they account for 12
correlations with the criteria. Thus, it is
not the identification with the group that
correlates with learning but the perception
that the class is personally gratifying
and without hostilities among the members.
Summary and Conclusions


This is one of a series of exploratory 
studies derived from a sociopsychological 
theory of the classroom as a social system 
(Getzels & Thelen, 1960). In a national, 
nonrandom sample of 76 high school 
physics classes, it tested the hypothesis that 
individual perceptions of 18 structural and 
affective aspects of classroom climate 
predict 9 cognitive, affective, and behavioral
learning measures adjusted for initial
differences. Simple and multiple correla- 
tion revealed significant and complex rela- 
tions between climate measures and learn- 
ing criteria. For example, stratification and 
friction predicted science understanding, 
but other climate variables predicted physics
achievement and attitudes toward
laboratory work. 

In addition, groups of climate variables 
predicted learning better than others. 
Among the structural variables, “isomor- 
phism” (the tendency for class members to 
be treated equally; see Discussion for 
further explanation) and “organization” 
(efficient direction of activity) predicted 
learning much better than “coaction” (com- 
pulsive restraint or coercion). Among the 
affective climate measures, “synergism” 
(personal relations among class members) 
predicted learning better than “syntality”
(identification with group goals). 

Replications of the entire series of stud- 
ies are being carried out with revised 
and, hopefully, more reliable instruments 
using a national random sample. Should 
the results hold up in other samples, 
especially in other school subjects, they 
should increase understanding of the social 
psychology of the class. Moreover, from 
a practical point of view, the ability to 
predict learning outcomes from assessments
of classroom climate may have implications
for teacher education, the
modification of in-service teachers, and
the assessment of teaching effectiveness,
provided educators can agree on measurable
goals of education.
WORD RECOGNITION BY CHILDREN OF TWO AGE LEVELS¹


JAMES W. HALL 
Northwestern University 


2 word lists were presented aurally to kindergarteners and third grad- 
ers. List 1 was presented under free learning instructions with half the 
Ss at each age pronouncing words aloud after presentation. For List 2, 
Ss reported whether each word had occurred in List 1. List 2 included 
10 “new” words which were associates of List-1 words and 10 which 
were not (experimental—EX and control—C words, respectively). 
5 List-1 words were repeated in List 2. More EX than C words were 
falsely recognized as having occurred in List 1, indicating that EX 
words occurred as implicit associative responses (IARs) during pres- 
entation of List 1. IAR-produced false recognitions were more fre- 
quent for younger than for older Ss and for pronouncing compared 
with nonpronouncing Ss. Recognition of repeated words was facilitated
by overt pronouncing for kindergarteners.


There is evidence that when a single, 
familiar word is presented to a human S, 
at least two types of implicit responses may 
occur. One is the response involved in the 
act of perceiving the word, and has been 
termed the “representational response” 
(RR) by Bousfield, Whitmarsh, and Da- 
nick (1958). A second response has been 
called the “implicit associative response” 
(IAR) by Underwood (1965) and may be 
conceived as a second word elicited as an 
internal response by the stimulus properties 
of the RR. For example, if scissors is 
presented, cut may occur as an IAR.
Underwood has demonstrated that the occurrence
of an IAR in this fashion may lead
an S to erroneously identify the word that
occurred as an IAR (cut in this case) as
having been presented.

In a more recent study, it was found 
that IAR-produced false recognitions oc- 
curred in young children but were more 
frequent in five- and six-year-olds than 
in eight- and nine-year-olds (Hall & Ware, 
1968). This age difference was contrary 
to expectations based on the notion that 
the older children would be more likely to 
produce IARs, thus more likely to become
confused between the word presented to
them and the IAR to that word.

¹ This study was supported in part by a grant
from the United States Office of Education. The
author is grateful to Paul J. Avery, formerly
Superintendent of Schools, and to Willis C. Mortensen,
Principal of Crow Island Elementary
School, in Winnetka, for their cooperation in providing
Ss and facilities.

One purpose of the experiment reported 
here is simply to replicate the age-differ- 
ence finding of the Hall and Ware study. A 
second major purpose is related to a hy- 
pothesis regarding this age difference. More 
specifically, it is proposed that the older 
children make fewer IAR-produced false 
recognitions than do the younger ones, not 
because the older children produce fewer 
IARs, but because their ability to dis- 
criminate between RRs and IARs is greater 
than that of the younger children. If this is 
the case, then anything that affects discriminability
of RRs and IARs should influence
frequency of IAR-produced false
recognition. One such variable may be the
overt pronunciation of the presented word.
Thus, in this experiment approximately half
of the Ss at each level were required to say
aloud each word presented to them, and the
remainder were given no instructions regarding
pronunciation of the words.


METHOD


Subjects 


The Ss were 40 kindergarten children (17 boys
and 23 girls) with mean chronological age of [illegible]
years, 10 months and 40 third graders (20 boys and
20 girls) with mean chronological age of [illegible] years,
11 months, enrolled in a public elementary school
in Winnetka, Illinois.




TABLE 1
List-1 (Free-Learning) Words Used and Their Functions

Word        Function    Word        Function
at          F           salt        CS
chair       F           eagle       R
money       F           girl        R
slide       F           gallop      CS
king        F           bed         CS
salt        CS          baby        R
eagle       R           thirsty     CS
gallop      CS          eating      CS
thirsty     CS          clear       R
girl        R           scissors    CS
scissors    CS          spider      CS
bed         CS          fingers     CS
baby        R           pretty      R
eating      CS          blossom     CS
pretty      R           lamp        CS
spider      CS          mouth       F
lamp        CS          pencil      F
clear       R           look        F
fingers     CS          window      F
blossom     CS          coat        F

Note.—Abbreviated: F = filler, CS = critical
stimulus, R = repeated.


Design 


The design called for the presentation of one 
word list under free learning (FL) instructions, 
followed by a second list under recognition instructions.
List 1 is shown in Table 1; List 2, in Table
2. The words are listed in the order in which they
were presented, with the function of each word indicated
beside it. Included in List 1 were 10
critical stimulus (CS) words, each of which has
been shown to elicit a particular response with relatively
high frequency when standard word-association
procedures are used. These high-frequency
responses to the CS words, presumed likely to
occur as IARs, were placed in List 2 as experimental
(EX) words. In recent word-association data
(Palermo & Jenkins, 1966; Entwisle, 1966), the
mean frequency with which the 10 EX words used
here were elicited by their respective CS words
was 43.7% for kindergarteners² and 45.4% for third
graders.

Also included in List 1 were five repeated (R) 
words, so termed because they also appeared in
List 2. To reduce learning differences within List 1 
due to serial position, 10 filler words occupied the 
first and last five positions in List 1. Each CS word 
and each R word appeared twice within List 1. 

² Because the Palermo and Jenkins norms do not
include data on kindergarteners, frequency estimates
for 7 of the 10 critical response-experimental
pairs were based on first-grade data.

List 2 contained the 10 EX words, 10 control
(C) words, the five R words, and six new filler
words. The C words were similar to the EX words 
in general frequency of occurrence (Thorndike & 
Lorge, 1944) but were not strong associates of any 
List-1 words. Thus, if false recognitions were more 
frequent for EX than for C words, it would be in- 
ferred that this difference was due to the previous 
occurrence of EX words as IARs. 


Procedure 


The procedure for each S (run individually) 
consisted of aural presentation of List 1 at a 5- 
second rate. Then, after a 7-minute interval, the 
List-2 items were presented at a 4-second rate. The 
Ss were instructed to respond “yes” if a word had 
occurred on List 1 and “no” if it had not. The 7- 
minute delay between learning and recognition was 
used to increase the difficulty of the recognition 
task. To prevent rehearsal during that interval, Ss 
were occupied with jigsaw puzzle tasks. All instruc- 
tions and words were presented by use of a tape 
recorder. 

During FL, an experimental variation in instruc- 
tions was introduced. Twenty-four of the younger 
and 23 of the older Ss, selected randomly, were in- 
structed to pronounce each word aloud after it had 
been presented, and to attempt to remember the 
word. The remaining 20 younger Ss and 23 older
Ss were instructed identically except that no re- 
quest for pronunciation was made. All Ss followed 
these instructions properly. For purposes of analy- 
ses, random procedures were used to exclude 4 Ss 
from one group and 3 from each of two others, 
equalizing the Ss at 20 per group. 


TABLE 2
List-2 (Recognition) Words Used and Their Functions

Word          Function    Word        Function
girl          R           house       F
needle        F           sleep       EX
train         F           pall        C
pepper        EX          go          C
[illegible]   C           food        EX
lion          C           clear       R
horse         EX          web         EX
eagle         R           run         C
hair          F           baby        R
lazy          F           church      C
tall          C           hand        EX
cut           EX          flower      EX
water         EX          south       C
car           C           light       EX
pretty        R           receive     C
read          F

Note.—Abbreviated: R = repeated, F = filler,
EX = experimental, C = control.
RESULTS


False Recognitions 


In Table 3 the mean numbers of false 
recognitions per S of the EX and C words 
are shown separately for each age level 
and each pronunciation condition. Using a 
difference score (EX — C) for each S, a 
t test for correlated data showed the overall 
mean of the differences (X = .81, SD = 
1.30) to be highly reliable, t = 5.56, df = 
79, p < .001. This is interpreted as con- 
firming earlier results by Underwood 
(1965), Davis (1967), and others in show- 
ing that EX words frequently are elicited as 
IARs during learning, resulting in their 
subsequent false recognition.
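
The t test for correlated data reported in this paragraph can be checked from the summary statistics alone. The sketch below assumes the usual one-sample form applied to the EX − C difference scores (mean divided by its standard error); the function name is hypothetical, and the inputs are the rounded mean, SD, and number of Ss given in the text.

```python
from math import sqrt


def paired_t(mean_diff, sd_diff, n):
    """t statistic and degrees of freedom for a mean of n difference scores."""
    if n < 2 or sd_diff <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    standard_error = sd_diff / sqrt(n)
    return mean_diff / standard_error, n - 1


# Summary values as reported: mean = .81, SD = 1.30, 80 Ss.
t_value, df = paired_t(0.81, 1.30, 80)
print(round(t_value, 2), df)
```

The recomputed value agrees with the reported t = 5.56 to within the rounding of the published mean and SD.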

The EX — C difference scores then were 
used to examine the effects of age and pro- 
nunciation instructions on frequency of 
IAR-produced false recognitions. Analysis 
of variance showed the main effect of age 
to be highly reliable, F = 7.48, df = 1/76, 
p < .01. That is, the frequency of IAR-produced
false recognitions was higher for
the younger than for the older Ss. The
main effect of instructions to pronounce
also was highly reliable, F = 10.89, df =
1/76, p < .01, indicating that overt
pronunciation reduced the frequency of 
IAR-produced false recognitions. The interaction
between these variables was not significant.


TABLE 3
False Recognition of Experimental (EX) and Control (C) Words and Correct Recognitions of Repeated (R) Words

Instruction: Overt pronunciation (X per S, SD); No overt pronunciation (X per S, SD)




Correct Recognitions 


The mean numbers of correct recogni- 
tions per S of the R words also are shown 
in Table 3. Since the data were charac- 
terized by marked skewness and nonhomo- 
geneity, nonparametric analyses were per- 
formed in which groups were compared in 
terms of the number of Ss who correctly 
recognized all five R words. On this basis, 
the two third-grade groups and the kindergarten
Ss under overt pronouncing instructions
were quite similar. The
numbers of perfect scores in these three
groups were 13, 11, and 15, respectively. Of
the nonpronouncing kindergarten Ss, however,
only 3 of the 20 correctly recognized
all five R words. For the kindergarten Ss,
the difference in this respect between the
two pronouncing conditions was highly reliable,
χ² = 12.22, df = 1, p < .001, while
the corresponding difference between the
two third-grade groups did not approach
significance. Clearly, the instructions to
pronounce did increase the frequency of
correct recognitions of R words by the
younger Ss. Although no similar effect was
found for the third graders, it should be 
noted that the performance level of the 
nonpronouncing third graders was so close 
to maximum that the possibility of a ceil- 
ing effect must be considered. 
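
The reported chi-square for the kindergarten Ss can be checked with a short sketch. It rests on two assumptions not stated outright in the text: that the overt-pronunciation kindergarten group also contained 20 Ss (suggested by df = 1/76 and "3 of the 20"), and that Yates' continuity correction was applied (that version reproduces the reported value). The function name is illustrative only.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shrink |ad - bc| by n/2 (never below zero) before squaring.
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    return numerator / (row1 * row2 * col1 * col2)


# Kindergarten Ss: (perfect scores, non-perfect scores), assuming 20 Ss per group.
overt = (15, 5)       # overt pronunciation: 15 of 20 recognized all five R words
no_overt = (3, 17)    # no overt pronunciation: 3 of 20
chi2 = yates_chi_square(*overt, *no_overt)
print(round(chi2, 2))  # 12.22
```

Under these assumptions the corrected statistic agrees with the value given in the text; without the correction the same table yields a larger value.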


Discussion 


False Recognitions 


... alternative seems unlikely in view of evidence
from other lines of research and
common conceptions of verbal development
in children. It is generally believed that
implicit verbal behavior increases markedly
from about 3-8 years of age. The
Kendlers (e.g., 1963), in particular, have
provided considerable experimental evidence
that verbal mediation increases substantially
during this period. Presumably,
IARs are frequently involved in such me- 
diation, so that one would expect their oc- 
currence to be more frequent in the older 
children, not less frequent. Of course, one 
might speculate that there is something 
about the particular experimental situation 
that inhibits the production of IARs by the
older children, and that under other cir- 
cumstances false recognitions would be 
greater for the older children. However, 
there is another alternative that appears 
more plausible, at least as a working 
hypothesis. 

Perhaps as a child grows older (5-8 
years, say) he does become more produc- 
tive of IARs, but, at the same time, he 
also becomes better able to discriminate 
between words that were presented and 
words that he was reminded of. The basis 
for discrimination is not clear. One possi- 
bility involves the frequency hypothesis 
proposed by Ekstrand, Wallace, and Un- 
derwood (1966). This notion is that the 
RR occurs with greater frequency than 
does the IAR. That is, S probably says the 
presented word silently several times while 
an IAR is unlikely to be rehearsed. This 
frequency difference, then, may be the 
basis for S’s ability to respond correctly 
during recognition, that is, to successfully 
distinguish between the IAR and the RR.
Possibly the older Ss rehearse the presented
words more than do the younger Ss,
so that the frequency discrepancy on which
discrimination is based increases with age. 

As predicted, pronunciation of the words
during FL reduced false recognitions that
were attributable to IAR occurrence. The
interpretation favored by the author is
that pronunciation of the words increased
discriminability of those words from the
IARs which they elicited, although the
basis for increased discriminability is unclear.
Again, frequency may be involved
if it is assumed that the speaking of a word,
at least for some Ss, adds to the number
of implicit rehearsals that the word receives,
producing a higher frequency of occurrence
than if the word had not been
spoken. It also could be argued that the 
time taken to pronounce the word left less 
time for IAR production than was avail- 
able for Ss who did not pronounce the 
word. Thus, it simply may be that fewer 
IARs were made by Ss who pronounced.
These, as well as other possibilities, cannot 
be evaluated at present. 


Correct Recognitions 


In the correct-recognition data one find- 
ing stands out—the striking effects of in- 
structions to pronounce on the performance 
of the younger children. The importance of 
pronouncing responses in a learning task 
of this type is well documented (e.g.,
Mechanic & D’Andrea, 1965). The present 
data suggest that with the younger chil- 
dren explicit instructions to pronounce 
markedly increase pronouncing and thus 
learning. It is not clear whether the pro- 
nunciation would need to be overt, as it 
was in the present instance, rather than 
covert in order to obtain this effect.

The fact that for the older Ss only a 
slight difference occurred between the pro- 
nouncing and nonpronouncing groups may 
have been due simply to a ceiling effect. 
However, an alternative worth examining 
further is that by the time a child is 8 or
9 years old, normal learning instructions
produce covert pronouncing responses so 
regularly that explicit pronouncing instruc- 
tions are superfluous. 
EFFECTIVENESS OF LEARNING FROM A PROGRAMMED
TEXT COMPARED WITH A CONVENTIONAL TEXT 
COVERING THE SAME MATERIAL 


WILLIAM J. DANIEL¹
University of North Carolina at Chapel Hill 


AND 


PETER MURDOCH 
University of Iowa 


Studying a programmed textbook was compared with studying a 
conventional textbook to determine which method leads to better 
performance on a content examination. Both texts covered similar 
material on operant psychology. In a setting where 12 teaching 
assistants each taught 2 discussion sections with enrollments of about 
22 students, 1 section studied the programmed text, and the other 
studied the conventional text. At the end of the semester, all students 
took an examination which contained several types of items. The 
results of the experiment favored the programmed text in that, for all 
6 objective types of items and for 5 out of 8 essay items, the level of 
performance of the programmed text group was higher than the con- 
ventional text group (p < .05 or better from the analysis of variance). 


One type of popular research on pro- 
grammed instruction compares the effec- 
tiveness of learning from a particular pro- 
grammed text or teaching machine with 
conventional material covering the same 
topic. The present study compares a pro- 
grammed text on operant psychology (Hol- 
land & Skinner, 1961) and a comparable 
conventional text (Skinner, 1953) in an 
introductory psychology course. The rela- 
tive effectiveness of learning from the two 
texts was measured by the final examina- 
tion of the course. 

Leith (1962) reviewed the literature on 
the effectiveness of programmed versus 
comparable regular material. He concluded 
that programmed texts save time over conventional
texts but do not lead to better
performance on a content examination.
One of the few consistent findings in the

¹ The late William J. Daniel initiated this research
but died before the data analyses were
completed or any of this report was written. The
second author wishes to thank: Robert Hall for
typing and mimeographing the examination, classifying
the items into their types, organizing the
scoring of the examinations, and helping on part
of a data analysis; Darrell Bock for providing
statistical advice without which this report could
not have been written; Robert Callahan and
ee Wells for critically reading an earlier
draft of the paper; and the instructors, Carolyn
‘ardall, John De Lorge, John Delk, Amy Kimura,
ie Moore, Elizabeth Rose, William Rouse,
Richard Sanders, Richard Sprott, Jane Webb, and
Roger Wells.


research on programmed materials is that 
learning, as measured by an examination, 
is about the same for the programmed and 
conventional materials. However, the fail- 
ure to demonstrate greater effectiveness for 
the programmed over the conventional materials
may be due to methodological weaknesses
in the previous research (Evans,
1965; Holland, 1965). Among the methodo- 
logical problems, the following have been 
suggested: programs too short; gross 
dependent measures; small sample size; or- 
der and transfer effects not controlled; dif- 
ferences in information, procedures, in- 
structions, and institutions not controlled 
when comparing the two types of material; 
criterion test too easy; and allowing Ss to 
take home the materials (Paulus, 1966). 
In a study which avoided these weaknesses, 
Ripple (1963) compared the effectiveness 
of four groups taught the same material, 
where two groups were taught with programs
and two were taught by some more
“conventional” means. Tests of retention
were given immediately after study of the 
material and 10 days later. The scores on 
the first test by both program groups were 
significantly higher than those of the conventional-method
groups, but on the second
test this difference was maintained for
only one of the program groups. Thus, 
when methodological weaknesses are 
avoided, material which is programmed 
may be learned better than the same material
presented in conventional format.
The experiment reported here was per- 
formed during a semester of a college 
sophomore-level psychology course, with 
laboratory and discussion sections taught 
by graduate teaching assistants. The stu- 
dents were told they were research par- 
ticipants only after they had taken the 
final examination of the course. Because 
each instructor taught two sections, it was 
possible to assign the programmed text to 
one of his sections and the regular text to 
his other section. All students took the 
same final examination, which consisted of 
several types of items. A multivariate analysis-of-variance
computer program (Bock,
1965, 1966) was used to assess the effects 
of the experimental variables on the sev- 
eral types of items. 


Method


Subjects 


The Ss were 577 students² enrolled in an introductory
psychology course at the University of
North Carolina at Chapel Hill. 


Procedure 


At course registration, the students were auto- 
matically assigned to 1 of 24 laboratory and dis- 
cussion sections which met once each week for 2 
hours during the semester. Five female and seven 
male graduate-student instructors each taught two 
of these sections, with enrollments ranging from 
18 to 26 undergraduate students. The book by 
Holland and Skinner (1961) was randomly as- 
signed as a required text to one section of each 
instructor, and the book by Skinner (1953) was 
assigned to his other section. The students in a 
particular section used only the text assigned to 
them. The material covered in the sections was
partly left up to the teaching assistant, provided 


² Two students were suspected of cheating on
the examination, and their responses were discarded,
leaving 575 Ss for the data analyses.

³ After their examination, the students were requested
to complete a 15-minute questionnaire.
They were to indicate which text(s) they had
studied, how much time they devoted to their
text(s) relative to others, and how they liked
their text(s) relative to others. These data will be
presented in a supplementary report. Some students
indicated that they had studied both the programmed
and conventional texts. The data from
these Ss were included in all analyses.
he did the following: gave instruction on methodology
in psychology and elementary statistics,
demonstrated research to the class, organized the 
class to perform research, and tested the students 
by at least one midterm examination covering the 
above material and about half of the assigned 
textbook. The instructors met about once a week 
with the senior author to agree upon the work 
to be covered in class. 

At the end of the course, all the students were 
tested on operant psychology by means of a 100- 
item examination. The test items came from a 
larger pool of items contributed by the teaching 
assistants, four of whom selected the items for the 
final examination. The first 92 items were objective, 
but there were 6 different types of objective items. 
The last 8 items were of the essay type. 

The objective item types were the following: 

Multiple-choice format. In this type of format, 
with a stem and four alternatives, the student 
picked out the one correct alternative. (a) Mul- 
tiple-choice-A (MCA) tested knowledge of spe- 
cific content (25 items). The MCA items were 
based on the more specific details of operant psy- 
chology as treated in the texts, such as how to read 
a cumulative record or the differences among the 
basic schedules of reinforcement. (b) Multiple- 
choice-B (MCB) tested knowledge of concepts 
and principles (24 items). The MCB items were
based on the more general concepts and principles 
covered in the texts, such as the analysis of emo- 
tion or punishment. (c) Multiple-choice-C (MCC) 
tested response to new material (10 items). The 
MCC items required the students to apply the 
operant psychology approach to examples not 
fully covered in the texts, such as instilling certain 
behaviors or describing novel response differen- 
tiations. (d) Multiple-choice-D (MCD) tested 
for applications to everyday life (11 items). The 
MCD items required the students to apply the 
operant psychology approach to everyday situations
not fully discussed in the texts, such as
driving a car or working for wages. 

Free-recall format. In this type of format the 
student filled in the blank space(s) in a sentence 
with one or two words. (a) Free-recall-C (FRC) 
tested response to new material (11 items). (b) 
Free-recall-D (FRD) tested for applications to
everyday life (11 items). The bases of the FRC
and FRD items were, respectively, like those of
the MCC and MCD items.

The differences among the item types within the
multiple-choice format and within the free-recall
format were difficult to maintain. This was due to
the fact that all teaching assistants wrote items
without a clear knowledge of the differences among
the types. It was felt, nevertheless, that the categories
were sufficiently established to treat the
examination as providing six objective item type
scores for each S. The different kinds of objective
items were randomly ordered on the test.

The essays were to be answered in about half
a page of writing per question, such as discussing
the Skinnerian analysis of the value of money or
the development and treatment of fears.

The students’ responses to the objective and 
essay items were scored by their instructors. The 
former were scored with the same punch-key 
answer sheets. Before the essay material was 
graded, the instructors met and developed common 
scoring criteria. Without knowing the student’s
name, the instructors used a 4-point scale (0-3)
to evaluate each essay written by their students. 


Data Analyses 


The mean number of correct items, per section, 
on the six types of objective items (total N = 
144) were analyzed in an analysis-of-variance de- 
sign with three factors. In addition to overall mul- 
tivariate F tests, univariate and step-down F tests 
were performed for each dependent measure for the 
sex-of-instructor effect, the between-instructor- 
within-sex effect, and the textbook effect. Text- 
book-by-instructor-within-sex was used as the error 
term in testing the significance of these effects. 
Before analysis, the order of the dependent vari- 
ables was specified in terms of their likelihood of 
accounting for the textbook effect. Based on previous
research, the order was MCA, MCB, MCC,
FRC, FRD, and MCD.

The mean scores per section for each of the
eight essays were analyzed in a manner similar 
to the objective item types. However, there was 
no specified order of importance for the essay
items, so they were ordered as on the examination. 
In addition, the MCA and MCB objective item 
types were added as the first two dependent 
measures of the analysis (total N = 240).
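
The item counts and the two total N values quoted above can be tallied in a few lines. This is only a bookkeeping sketch: the variable names are illustrative, and it assumes each total N is the number of section means entering the analysis (24 sections times the number of dependent measures).

```python
# Item counts per objective type, as listed in the Procedure section.
objective_items = {"MCA": 25, "MCB": 24, "MCC": 10, "MCD": 11, "FRC": 11, "FRD": 11}
essay_items = 8
sections = 24  # 12 instructors, each teaching 2 sections

n_objective = sum(objective_items.values())
total_items = n_objective + essay_items

# Assumed reading of "total N": one mean per section per dependent measure.
n_objective_analysis = sections * len(objective_items)   # six objective item types
n_essay_analysis = sections * (2 + essay_items)          # MCA, MCB, and eight essays

print(n_objective, total_items, n_objective_analysis, n_essay_analysis)
# 92 100 144 240
```

On this reading the counts match the 92 objective items, the 100-item examination, and the totals of 144 and 240 given in the text.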

Subsequent to the above analyses, reanalyses
were performed on the objective items data and
the essay items data.

The objective items were scored differently for
the essay items analyses than for the objective
item types analyses. In the latter analyses partial
credit was given for some answers. However, this
occurred on only a few items, so partial credits
tended to make the within cell distributions less
normal than without them. On the analyses of the
essay items all partial credits were rounded off.
This accounts for the slight differences in the scores
of the same two objective item types in the analyses
of the objective item types and of the essay
items.

⁴ The reader unfamiliar with the multivariate
and univariate analysis-of-variance approach used
here is referred to Bock (1965, 1966). The second
paper uses the present study as “An excellent example
of a multivariate study using the principle
of blocking by teacher ... [p. 834].” Unfortunately,
the data supplied by Robert Hall to Darrell Bock
for his analysis were mislabeled and contained an
error in data computation; there are also typographical
errors in two of the F ratios reported by
Bock. This accounts for the differences between
Bock’s results and those reported here
(R. D. Bock, personal communication, July 29, 1967).


Results


The mean scores for each section for each
type of objective item and for each essay
item were calculated and analyzed as indicated.

The results of the multivariate F tests
for the equality of the mean vectors for
the six objective item types were as
follows. There was no evidence of a sex effect
(p = .300). The between-instructors-within-sex
effect was statistically significant
(p = .010), indicating that the
sensitivity of the experiment was increased 
by having each instructor teach a section 
which used the programmed text and a 
section which used the regular text. The 
textbook effect was also statistically sig- 
nificant (p = .002), which indicates that, 
as a set, the objective measures differed 
according to the kind of textbook studied. 

The nature of the textbook effect is pre- 
sented in Table 1. For each of the six objec- 
tive item types, the mean number of correct 
items for the programmed text group 
minus the mean number of correct items 
for the conventional text group is positive. 
The sizes of the mean differences and the 
standard errors of the contrasts are given 
in Table 1. The six univariate F ratios for 
the textbook factor were all statistically 
significant (p < .05 or better). Thus, the 
univariate tests supported the hypothesis 
that test performance of the programmed 
text group is better than that of the con- 
ventional text group. 
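
As a rough consistency check on Table 1, the sketch below recomputes two quantities from the tabled mean differences and standard errors. Both formulas are assumptions made for illustration, not the authors' stated procedure: a single-df F ratio is approximated as the squared ratio of the difference to its standard error, and the contrast mean square is taken as the usual two-treatment randomized-block value, blocks × d² / 2, with the 12 instructors as blocks. Because the tabled values are rounded, only approximate agreement with the reported figures should be expected.

```python
# Mean difference, standard error, and reported mean square for each
# objective item type, transcribed from Table 1.
TABLE_1 = {
    "MCA": (1.70, 0.21, 17.33),
    "MCB": (1.63, 0.35, 15.90),
    "MCC": (0.36, 0.16, 0.76),
    "FRC": (1.79, 0.25, 19.28),
    "FRD": (1.06, 0.15, 6.74),
    "MCD": (0.86, 0.08, 4.43),
}
BLOCKS = 12  # instructors; each taught one section per text


def approx_f(diff, se):
    """Single-df F ratio approximated as the squared ratio diff / se."""
    return (diff / se) ** 2


def approx_mean_square(diff, blocks=BLOCKS):
    """Assumed contrast mean square for two treatments in `blocks` blocks."""
    return blocks * diff ** 2 / 2


# Print: item type, approximate F, approximate mean square, reported mean square.
for item_type, (diff, se, reported_ms) in TABLE_1.items():
    print(item_type, round(approx_f(diff, se), 2),
          round(approx_mean_square(diff), 2), reported_ms)
```

The recomputed mean squares fall within a few hundredths of the reported ones, which suggests the tabled columns are internally consistent under these assumptions.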

The step-down F tests indicated that 
when the effects of MCA (knowledge of 
specific content) were eliminated statisti- 
cally from the other five dependent meas- 
ures, the differences in test performance 
on the objective item types between the 
two text groups were no longer statistically 
significant (all p’s > .05, Table 1). One
interpretation of the step-down F ratios is
that knowledge of specific content is 
learned better from a programmed text 
than a conventional text. Furthermore, this 
superiority transfers to enable the student 
who studied the programmed text to do 
TABLE 1

COMPARISON OF EXAMINATION PERFORMANCE OF THE PROGRAMMED VERSUS THE CONVENTIONAL TEXT
GROUP ON SIX OBJECTIVE ITEM TYPES

                                                     Item type
Source                          MCA       MCB      MCC      FRC       FRD       MCD

Text contrast (programmed-
  conventional text)
  Mean difference               1.70      1.63     .36      1.79      1.06      .86
  Standard error of difference  .21       .35      .16      .25       .15       .08
  Mean squareᵃ                  17.33     15.90    .76      19.28     6.74      4.43
  Univariate F                  67.18**   21       4.85*    50.98**   50.94**   106.12**
  Step-down F                   67.18**   .00      .12      4.20      3.50      .34

Note.—Abbreviated: MC = multiple-choice format; A = knowledge of specific content; B = knowledge
of concepts and principles; C = responding to new material; FR = free-recall format; D = application
to everyday life.

ᵃ df = ...


better on the more abstract types of ob- 
jective items. For this interpretation to 
hold it must be shown that there are actual 
differences among the several objective 
item types relative to the textbook effect. 
On the other hand, if MCD, the most ab- 
stract of the objective item types, acts 
just like MCA in the multivariate F test, 
the examination is largely homogeneous 
relative to the textbook effect. These con- 
trasting views were tested in an addi- 
tional analysis of the objective items, 
where the order of the dependent variables 
was changed from MCA, MCB, MCC, 


TABLE 2

STEP-DOWN F RATIOS FOR TWO ORDERS OF SIX
OBJECTIVE ITEM TYPES FOR THE PROGRAMMED
VERSUS THE CONVENTIONAL
TEXT CONTRAST

Order 1        MCA       MCB     MCC    FRC     FRD     MCD
Step-down F    67.18*    .00     .12    4.20    3.50    .34

Order 2        MCD       FRD     MCC    FRC     MCB     MCA
Step-down F    106.12*   2.69    .87    .62     .01     .53

Note.—Abbreviated: MC = multiple-choice
format; A = knowledge of specific content; B =
knowledge of concepts and principles; C = responding
to new material; FR = free-recall format;
D = application to everyday life.

* p < .0001.


FRC, FRD, and MCD in the first analysis 
to MCD, FRD, MCC, FRC, MCB, and 
MCA. The only values which can be dif- 
ferent in the two analyses are the step- 
down F ratios. Table 2 gives the step- 
down F values for the two orders of the 
objective item types. Knowledge of specific 
content and applications to everyday life 
are both able to account for the textbook 
differences, and, therefore, the objective 
item types were largely homogeneous.

The analyses of the eight essay items
were similar to those performed on the objective
item types. In the first analysis,
MCA and MCB were the first two dependent
measures, while in the second
analysis they were replaced by MCD and
FRD. In the two analyses the values which
can differ are the multivariate F ratios for
the tests of the equality of the mean vectors,
the step-down F ratios, and the univariate
F ratios for the objective item
types.

The results of both multivariate tests
for the equality of the mean vectors for
the essay items were similar to those for
the six objective item types. There was
no statistically significant sex effect (p =
.422 in the first analysis and p = [illegible] in
the second analysis) and a statistically
significant between-instructors-within-sex
effect (p = .001 for both analyses).
TABLE 3

COMPARISON OF EXAMINATION PERFORMANCE OF THE PROGRAMMED VERSUS THE CONVENTIONAL TEXT
GROUP ON TWO OBJECTIVE ITEM TYPES AND EIGHT ESSAY ITEMS

                                    Objective                                    Essay
Source                          MCA        MCB       1         2         3          4       5      6      7         8

Text contrast (programmed-
  conventional text)
  Mean difference               1.63       1.70      .29       .37       .31        .18     -.20   .24    ...       ...
  Standard error of difference  .35        .21       .07       .08       .05        ...     ...    ...    ...       ...
  Mean squareᵃ                  17.33      15.90     .49       .81       .58        .20     .25    ...    .94       .17
  Univariate F                  67.18***   21.60**   14.74**   18.67**   35.54***   7.23*   .14    4.66   21.95**   4.49
  Step-down F                   67.18***   .00       .65       8.96      5.68*      .05     .06    1.90   2.50      .00

Note.—Abbreviated: MC = multiple-choice format; A = knowledge of specific content; B = knowledge
of concepts and principles.

ᵃ df = 1/11.
* p < .05.
** p < .005.
*** p < .0001.

However, the textbook effect was short of significance
in the first analysis (p = .088),
while it was significant in the second anal- 
ysis (p = .027). 

The nature of the textbook effect for the 
essay items is presented in Tables 3 and 4. 
As can be seen from Table 3, which pre- 
sents the textbook contrasts, the hypothesis 
of greater learning from the programmed 
text than the conventional text is supported 
by the univariate F tests. For seven of the
eight essays, the mean score for the programmed
text group minus that for the conventional
text group is positive, and five of the
relevant F ratios were statistically significant
(p < .05 or better).


Table 4 presents the step-down F ratios 
for the two analyses of the essay items. 
When the effects of MCA are statistically 
removed from the remaining nine de- 
pendent measures, only for essay Number 3 
does the difference in test performance 
between the two text groups remain sta- 
tistically significant (p < .05). When the 
effects of MCD are statistically removed 
from the other nine measures, the textbook 
contrasts are all statistically nonsignificant 
(p > .05). Taken together these two analyses
suggest that, for the essay items,
knowledge of specific content and applications
to everyday life are not both able to
account for the textbook differences.


TABLE 4

STEP-DOWN F RATIOS FOR ANALYSES WITH FOUR OBJECTIVE ITEM TYPES AND THE EIGHT ESSAY ITEMS
FOR THE PROGRAMMED VERSUS THE CONVENTIONAL TEXT CONTRAST

                  Objective                              Essay
               MCA       MCB     1      2       3       4      5      6       7       8
Step-down F    67.18**   .00     .65    8.96    5.68*   .05    .06    1.90    2.59    .00

                  Objective                              Essay
               MCD       FRD     1      2       3       4      5      6       7       8
Step-down F    ...       ...     ...    ...     ...     ...    ...    ...     ...     ...

Note.—Abbreviated: MC = multiple-choice format; A = knowledge of specific content; B = knowledge
of concepts and principles; D = application to everyday life; FR = free-recall format.

* p < .05.
** p < .0001.


Discussion 


Consideration of the univariate F tests 
supported the hypothesis of greater effec- 
tiveness of learning from a programmed 
text over a conventional text covering the 
same matter. Of the 14 contrasts of test 
performance of the programmed minus the 
conventional text group, 13 were positive 
and statistically significant (p < .10 or 
better). Nine of the F ratios relating to 
these differences were significant at p < .01 
or better. For the one item on which the 
contrast was opposite to prediction, the 
difference was statistically nonsignificant 
(p > .10). 

The objective test items were similar in 
format to the programmed material. It 
could thus be argued that the greater ef- 
fectiveness of learning from the pro- 
grammed over the conventional text was a 
function of transfer between text and test.
However, since the essay items were simi- 
lar in format to the conventional material, 
use of the transfer argument leads to the 
prediction that on the essays, the con- 
ventional text group will perform better 
on the examination than the programmed 
text group. This clearly was not the case.

Consideration of the step-down F analyses
suggested that, relative to the treatment
effects, the objective item types did not
differ among themselves. The essay items
appeared to differ from the MCA objec- 
tive item type but not from the MCD ob- 
jective type. The essay items were intended 
to be the most abstract type of item, but 
as it turned out, they were comparable to 
the MCD objective item type. Because the 
test item types were largely homogeneous, 
and because there was no clear a priori
theoretical basis for differentiating among 


them, it was not possible to establish differential
effects of ... to the textbook ...
is learned better from a programmed text
than a conventional text and that this relative
superiority transfers to more abstract
item types).

In research of the present type there is 
a problem concerning the extent to which 
the programmed and conventional texts 
are comparable. That is, it is not known 
to what extent they cover the same content,
are equally difficult, are equally interesting,
etc. The textbooks used here were written
by the same author for the same purposes,
so the books should be comparable in
style, difficulty, and content, etc. The instructors
were very familiar with both
books and were asked to judge the com- 
parability of the books by means of a 
suitable questionnaire. In addition, they 
indicated whether the examination items 
tended to be based more on one of the 
texts than on the other. According to the 
ratings of the instructors, the texts were 
comparable, and the examination did not 
favor one of the texts. It was felt, there- 
fore, that the present experiment provided 
a fair comparison of the practical usefulness
of the two texts by Skinner. The programmed
text was more effective for
teaching operant psychology than the conventional
text. More specifically, compared
with the conventional group, the program
group on the average obtained a 10% 
higher score on each multiple-choice item 
type and a 7% higher score on each essay. 
Lumsdaine (1961) has emphasized the
need for more theoretically oriented research
on programmed materials. While
the present study presents little theory, it
does provide a sophisticated approach for
discovering what Lumsdaine calls absolute
and contingent generalizations. An example
of an absolute generalization is that test
performance on all types of items is increased
more by using the programmed
text than the conventional text. Contingent
generalizations express the predicted effects
of one factor in relation to others
with which it is expected to interact
(Paulus, 1966). The present analysis-of-variance
approach provides a method for
obtaining empirical support for both kinds
of generalizations. The use of overall multivariate
effects and similarities among
univariate effects leads to the establish- 
ment of absolute generalizations, while the 
use of specific univariate and step-down 
effects leads to the establishment of con- 
tingent generalizations. 
RELATIONSHIP BETWEEN FORMAL INTRALIST SIMILARITY
AND THE von RESTORFF EFFECT¹


S. JAY SAMUELS
Department of Educational Psychology, University of Minnesota 


Experiment I: 60 1st graders were randomly assigned to either a high, 
medium, or low stimulus similarity paired-associate list. During 
learning trials 1 stimulus in each list was printed in red and the 
other stimuli were in black. Transfer trials were given after each 
5 learning trials. During learning trials in the high-similarity list 
more (p < .001) correct responses were given to the stimulus in
red. At transfer there was a reversal of significance, and fewer correct
responses (p < .02) were given to the stimulus formerly in
red. On medium- and low-similarity lists the differences in correct
responses to red and black stimuli were not significant at learning
or at transfer. Experiment II: 30 college Ss learned a high stimulus
similarity paired-associate list. During learning trials 1 stimulus
was in red. The other stimuli were in black. On transfer trials all
stimuli were in black. Learning and transfer trials were alternated.
The results were similar to those found in Experiment I for the high-similarity
list. It appears that magnitude of the von Restorff effect
is influenced by stimulus similarity. Failure to find positive transfer
for the stimulus formerly in red was discussed in terms of attentional
factors in learning.


Since Hedwig von Restorff first per- 
formed her landmark experiment, in which 
she demonstrated the more rapid learning 
of an item in a list which was different 
from other items in the list, much subse- 
quent research has confirmed her original 
findings. The facilitation in learning a 
particular item in a list produced by iso- 
lating that item or making it distinctive 
from other items has come to be called “the 
von Restorff effect.” 

Knowledge of factors which influence the 
von Restorff effect is of theoretical and 
practical significance. Of theoretical importance
is the effect of formal intralist
similarity on the magnitude of the von 
Restorff effect. Other questions of im- 
portance relate to transfer and attentional 
factors in learning lists containing isolated 
items. An educator might argue that a
technique which facilitates learning but 


¹ Appreciation is extended to Joel Best for
his contributions in Experiment I, and to Jim
Palmer for his help in Experiment II.

This research was supported by grants to
the University of Minnesota, Center for Research 
in Human Learning, from the National Science 
Foundation, the National Institute of Child 
Health and Human Development, and from the 
Graduate School of the University of Minnesota. 


which produces negative transfer is of 


spurious educational value. In fact, if dur- 


ing learning S attends to an aspect of the 
stimulus which will be removed during 
transfer, then one can predict that any su- 
periority gained during learning would 
probably be lost at transfer. While ques- 
tions of stimulus similarity, transfer, and 
attention are of theoretical interest, they 
are relevant to practical problems in edu- 


cation. For example, several reading prim- | 
ers published in Europe print new words — 


and parts of words in distinctive colors. 
At the University of Pittsburgh a research 
program in the teaching of reading has 
each vowel phoneme associated with 4 
color, and the graphemiec representation of 
the vowel is printed in that color (Kjelder- 


gaard & Frankenstein, 1967). While these 


innovations may facilitate learning, it 18 
important to consider the effect at transfer 
when the incidental cues are removed. \ 

One way to isolate in an experimental list is to print an item in a
color different from the other items. Newman (see Wallace, 1965) found
that when either the stimulus (S) or response (R) term of a
paired-associate (PA) list was isolated in color, learning the pair was
facilitated. Another way to isolate is to insert an item of high
meaningfulness in a list of low-meaningfulness items. Rosen,
Richardson, and Saltz (1962) reported that learning was facilitated to
a greater extent when a high-meaningfulness item was inserted in a
low-meaningfulness list than when it was placed among items of high
meaningfulness. They accounted for their results by noting that in a
low-meaningfulness list, in which discriminability among items is poor,
isolation serves to differentiate the items. Another variable which
should influence the von Restorff effect in a somewhat similar manner
as does meaningfulness is formal stimulus similarity. Formal similarity
refers to the degree of visual distinctiveness among items. The purpose
of Experiment I was to determine the effect stimulus similarity and
stimulus isolation have on the von Restorff effect when only a single
stimulus within the list is isolated by color. It was predicted that
when stimulus similarity is high, rate of PA learning will be faster
for the S-R pair in which the stimulus term is isolated than for S-R
pairs in which the stimulus terms are not isolated; when stimulus
similarity is moderate or low, there will be no difference in rate of
PA learning between an S-R pair having an isolated stimulus term and
S-R pairs having nonisolated stimulus terms. During learning trials an
isolated stimulus term was presented in red, while the nonisolated
stimulus terms were presented in black. After each five learning
trials, a transfer test was given to determine if Ss were able to give
the correct response when a formerly isolated (red) stimulus term was
presented in black.


Experiment I


Method 


Subjects. Sixty first-grade Ss with no reading problems were used.
Twenty Ss were randomly assigned to PA List 4-L, 20 to List 6-L, and 20
to List 8-L. The Ss were then randomly assigned to rows in the design.

Design. A 4 X 4 repeated-measures Latin-square design was used, one for
each of the three PA lists. As seen in Figure 1, the first column of
the design contained the words which were isolated by printing them in
red. Columns 2, 3, and 4 contained the nonisolated words which were
printed in black. Each row in the design contained a different word
which was isolated in red. The order of presentation for the words in
that row was randomized for each trial. It is important to note that
each S was required to learn responses for isolated and nonisolated
stimuli for the row in the design to which he was assigned. Thus, this
design permitted comparisons to be made of PA learning rate as well as
for transfer of isolated and nonisolated words presented to the same S.

Fig. 1. Paradigm of 4 X 4 repeated-measures Latin-square design. (Each
S learns PAs for words in row to which he is assigned.)
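
The row-and-column arrangement of the design can be illustrated with a
short sketch. It assumes a cyclic construction, which is only one of
several 4 X 4 Latin squares that fit the description; the function name
is hypothetical, and the response words stand in for the
artificial-alphabet stimulus words, which cannot be typed here.

```python
def cyclic_latin_square(items):
    """Return an n x n cyclic Latin square: row i is `items` rotated left by i."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Assumption: response words stand in for the artificial-alphabet stimuli.
WORDS = ["toy", "dog", "cat", "man"]
# Column 1 holds the word isolated in red; Columns 2-4 are printed in black.
COLORS = ["red", "black", "black", "black"]

design = cyclic_latin_square(WORDS)
for row_number, row in enumerate(design, start=1):
    cells = [f"{word} ({color})" for word, color in zip(row, COLORS)]
    print(f"Row {row_number}: " + ", ".join(cells))
```

In such a square every word appears once in each row and once in each
column, so each word is the red (isolated) item for exactly one row of
Ss.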


Paired-Associate Lists 


An artificial alphabet was used that had as lit- 
tle resemblance to English letters as possible. 
From the alphabet three lists of two-letter words were constructed (see
Figure 2). List 4-L (high stimulus similarity) had four two-letter
words constructed from only four different letters. List 6-L (medium
stimulus similarity) had four two-letter words constructed from six
different letters. List 8-L (low stimulus similarity) had four
two-letter words constructed from eight different letters. The same
response words were used for the three lists. They were: toy, dog, cat,
man.

For the learning trials, three of the words in each list were printed
in black, while the fourth was isolated by printing it in red. For the
transfer trials, all the stimuli were printed in black. The stimuli
were printed on 5 X 8 inch index cards.

Fig. 2. Stimuli and responses for Groups 4-L, 6-L, and 8-L. (Columns
for each group: Words, Pronounced.)


Procedure

The E worked individually with Ss. The PA 
anticipation procedure was used. When S was 
shown a stimulus card, he was allowed an approx- 
imate 4-second interval to respond before E gave 
the correct response. On the learning trials, if S gave the correct
response, E immediately echoed the correct response and presented the
next stimulus card. If S answered incorrectly or gave no response, E
gave the correct response and had S repeat it before presenting the
next stimulus.

After each five learning trials, a transfer test was given. During the
transfer tests feedback was not given. The S was given a total of 15
learning trials and three transfer tests during the course of the
experiment. For each of the three lists the stimuli were presented in
the same random orders on the learning and transfer-test trials.


Results and Discussion 


Table 1 shows the means and standard 
deviations for Lists 4-L, 6-L, and 8-L on 
learning trials and transfer tests. The 1-R 
in Table 1 refers to the words isolated in 
red in Column 1, while 2-B, 3-B, and 4-B 
refer to the nonisolated words printed in 
black in Columns 2, 3, and 4. On the trans- 
fer tests all the words were printed in 
black. 

Learning. Analyses of variance for repeated-measures Latin-square
designs were computed on correct responses given during learning for
Lists 4-L, 6-L, and 8-L. The treatment effects of red versus black
words were significant in List 4-L (F = 10.87, df = 3/48, p < .001),
not significant in List 6-L (F = 1.75, df = 3/48, ns), and not
significant in List 8-L (F = 1.22, df = 3/48, ns).

In order to determine for the learning trials if the mean number of
correct responses given to the isolated words was significantly
different from the mean number of correct responses given to the
nonisolated words, planned orthogonal comparisons were computed for
Lists 4-L, 6-L, and 8-L. On high stimulus similarity List 4-L the
differences between the means for isolated and nonisolated words were
significant (F = 30.79, df = 1/48, p < .001). On medium stimulus
similarity List 6-L the differences between the means for isolated and
nonisolated words were not significant (F = 3.76, df = 1/48, ns). On
low stimulus similarity List 8-L the differences between the means of
the isolated and nonisolated words were not significant (F = 3.06,
df = 1/48, ns).

Transfer. While significant differences during learning in number of
correct responses were found in one of the three lists favoring the
isolated over the nonisolated words, the critical issue was one of
transfer or ability to give the correct response when the word formerly
in red was presented in black. To compare number of correct responses
for words which were presented in red during learning and in black
during transfer with words presented always in black, t tests for
correlated samples were computed. On high stimulus similarity List 4-L
the pooled mean for number of correct responses given to the
nonisolated words was significantly greater than the mean number given
to the formerly isolated words (t = -2.67, df = 19, p < .02,
two-tailed). On medium stimulus similarity List 6-L the pooled mean for
number of correct responses given to the nonisolated words was not
significantly different from the mean for the formerly isolated words
(t < 1, df = 19, ns). The same results were found for low stimulus
similarity List 8-L (t < 1, df = 19, ns).


TABLE 1

Means and Standard Deviations for Correct Responses During Learning and
Transfer Reported by Columns Within Similarity Groups

Stimulus similarity: 4-L, 6-L, 8-L
Columns within each list: 1-R, 2-B, 3-B, 4-B

Learning a
uM 9.85 | 5.90 | 4.70 | 5.75 | 10.7 10.45 | 10.
20 1.4.70 | 5. -75 | 9.80 | 9.50 | 8.70 | 11.90 | 10.00 | 10. :
SD 4.73 | 3.13 | 2.98 7 9 | 3.55| 37
Transfer 3.58 | 3.19 | 3.24 | 3.79 | 3.86 | 3.39| 3.3 e
M -60/ 1.05 | .90| 1.15} 2.10 | 2 5 ‘ 5 | 2.45) *
+90 | 1. +10 | 2.40 | 2.15 | 2.10 | 2.80] 2.55

aly 89] 1.05| 79] 88] “'85| ‘ge | 1.04] 1.12] 1.08] .69| 9] ©

Note.—R = words in red, B = words in black.

It would appear that in high similarity 
List 4-L, during learning trials, where dis- 
crimination on the basis of letter form was 
difficult, Ss responded to isolated words on 
the basis of color. On transfer tests the color 
cue was absent, and Ss did poorly with the 
formerly isolated words. On Lists 6-L and 
8-L, where discrimination on the basis of letter form was easier, color
was a less potent cue. For these lists it appears that during learning
Ss tended to use letter form as a cue for responding even for isolated
words. One may find support for this interpretation by noting that on
transfer tests for Lists 6-L and 8-L the means for isolated and
nonisolated words were not significantly different from each other.


Experiment II 


The failure in Experiment I to find posi- 
tive transfer for the S-R pair in which the 
stimulus term was printed in red may be 
attributed to several sources. In the first experiment only three
transfer tests were given. It is conceivable that Ss did not have
sufficient experience with the transfer task to realize that color was
an irrelevant cue and letter shape the primary cue. Consequently, a
pilot study was run with first graders using only a high-similarity
list. The procedure was changed so that every learning trial was
followed by a transfer test. Although this procedure provided ample
opportunity for S to realize that for transfer tests color was an
irrelevant dimension, the results were the same as in Experiment I
(i.e., facilitation for the isolated pair during learning and negative
transfer on tests where color cues were removed). Another explanation
for failure to
find positive transfer is that the children 
lacked sophisticated learning strategies 
and were unable to focus attention upon 
the less salient but critical dimension of 
letter shape. It is possible that college Ss, 
with sophisticated strategies for learning, 
are able to focus attention upon the critical 
dimension of letter shape in the presence of 
the more salient dimension of color. 

The purpose of Experiment II was to 
determine whether it would be possible to 
facilitate PA learning and transfer by (a) 
isolating with color one of the stimulus 
terms, (b) providing enough transfer tests 
for S to realize that in order to respond 
correctly at transfer he would have to 
focus on letter shape during learning, and 
(c) using Ss with sophisticated learning 
strategies. 


Method 


Subjects. Thirty Ss enrolled in introductory
educational psychology were used. 

Design. A repeated-measures design was em- 
ployed in which each S learned one PA list con- 
taining S-R terms representing two conditions. In 
one condition during learning trials both terms of 
the S-R pair were printed in black. In the second 
condition during learning trials the S term of a 
pair was printed in red, while the R term was 
printed in black. On tests of transfer, S terms of 
both conditions were printed in black. 

Materials. A high stimulus similarity list was used. The S-R pairs
were: xyz-money, xar-jewel, xaz-kitchen, var-dinner, zyx-office,
vur-garment, zor-wagon, zov-village, vyx-heaven, rvy-insect. The list
during learning trials contained nine S-R pairs which were printed in
black letters, while in the tenth pair the S term was isolated by
printing it in red and the R term was printed in black. For
transfer-test trials all 10 S terms were printed in black, and no R
terms were shown. Ten different PA lists were made. In each list a
different S term was isolated in red. The lists were presented with a
Lafayette memory drum.


Procedure 


Three Ss were assigned to each of the 10 PA lists according to order of
appearance. The Ss were run individually. Learning and transfer trials
were alternated so that each S received seven learning and seven
transfer tests. For the learning trials standard PA anticipation
procedure was used. During the anticipation interval the S terms were
exposed for 2 seconds followed by exposure of the S-R terms together
for an additional 2 seconds.

For the transfer tests each S term was presented 
alone for 4 seconds. No R-term feedback was pro- 
vided during transfer tests. An 8-second intertrial 
interval was used between learning and transfer 
trials. The S was told that the same nonsense 
syllables would appear in different orders for 
learning and transfer, and he was to give the 
correct response associated with the stimulus as 
soon as he was able. 


Results 


Correct responses to the stimulus in red 
were compared to correct responses to the 
stimuli in black. Since there was but one 
stimulus in red and nine stimuli in black, 
proportions of correct responses to red and 
black over seven trials were calculated. 
Two matched-pairs t tests were computed,
one for learning and one for transfer.

During learning the mean proportion of correct responses to stimuli in
red was .59 (SD = .33) and to black, .32 (SD = .13). This difference
was significant (t = 4.23, df = 29, p < .001, one-tailed).

During transfer the mean proportion of correct responses given to the
stimuli printed always in black was .42 (SD = .15) and to the stimuli
printed in red during learning and black during transfer, .27
(SD = .32). This difference was significant (t = -3.41, df = 29,
p < .002, two-tailed).
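
The matched-pairs comparison used for these proportions can be sketched
as follows. The reported means and SDs are not enough to recompute the
published t values, because the standard deviation of the per-S
differences is not given, so the proportions below are hypothetical and
the function name is illustrative only.

```python
import math
import statistics


def matched_pairs_t(first, second):
    """Return (t, df) for the paired differences first[i] - second[i]."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    # Standard error of the mean difference, using the sample SD (n - 1).
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical per-S proportions correct; these are NOT the study's data.
red = [0.6, 0.8, 0.4, 0.7]      # isolated (red) stimulus
black = [0.3, 0.5, 0.2, 0.3]    # mean over the nine black stimuli
t, df = matched_pairs_t(red, black)
print(f"t = {t:.2f}, df = {df}")
```

With 30 Ss the same computation yields df = 29, as in the values
reported above. Working with proportions puts the single red item and
the nine black items on a common scale before the differences are
taken.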


Discussion 


Experiment II was conducted to determine if in a high stimulus
similarity PA list the learning and transfer of an S-R pair could be
facilitated by isolating with color the S term of the pair. The
procedure employed alternating learning and transfer trials and using
Ss with sophisticated learning strategies. The results disclosed that
during learning trials, when color cues were present, significantly
more correct responses were given to the isolated than to the
nonisolated S terms. During transfer, when color cues were removed,
significantly more correct responses were given to the nonisolated S
terms. Correct responses to isolated S terms on learning trials and
failure to do so at transfer indicated that Ss were attending to
stimulus color and not stimulus shape. Although color was a relevant
dimension during learning trials, it was an irrelevant dimension during
transfer. The fact that Ss focused primarily on the irrelevant
dimension of color rather than the relevant dimension of shape for the
isolated S-R pair is surprising, first, because the number of
alternations between learning and transfer trials was sufficient to
indicate that color cues were being removed at transfer, and second,
because Ss were told that the same nonsense syllables would appear at
learning and at transfer. It thus appears that when college Ss learn a
high stimulus similarity PA list in which an S term has been isolated
by printing it in color they find it difficult to focus on a less
salient, but critical, cue in the presence of a more salient cue. In
this respect the learning strategy of college Ss was similar to that of
the children in Experiment I.

Several conclusions may be drawn from 
the findings. The von Restorff effect seems 
to be a less universal phenomenon than 
previously thought; that is, there are 
conditions under which increasing the dis- 
tinctiveness of an item in a list does not 
facilitate learning. The von Restorff effect 
in PA learning seems to be reliably found 
in lists of high stimulus similarity where 
discriminability among items is poor. By isolating a stimulus in a
high-similarity list, that item becomes distinctive in contrast to the
other stimuli, and the isolated pair is learned more rapidly than the
nonisolated pairs. The von Restorff effect is not reliably produced in
low stimulus similarity PA lists where the items are already highly
discriminable from each other. A qualifying factor to this conclusion
may be list length. If a low-similarity list is long and difficult to
learn because of its length, it is possible that in this list isolation
may facilitate learning. Thus, stimulus similarity, isolation, and list
length may interact.

Another finding of interest relates to the difficulty Ss had in
shifting attention from a more salient but irrelevant cue to a less
salient but relevant cue. A similar finding was reported in a study on
attentional processes in reading (Samuels, 1967). In this study
children had to learn PAs. During learning trials the stimulus contained
both a picture which could reliably elicit 
the correct response and letters which 
spelled the word. On transfer trials only 
letter stimuli were presented. Although 
learning and transfer trials were alter- 
nated, on learning trials Ss attended to the 
cue which most reliably elicited the cor- 
rect response (i.e., pictures) and conse- 
quently did poorly on transfer tests when 
picture cues were omitted. The conclusion 
one may reach is that in verbal learning, 
when the stimulus complex has several 
dimensions upon which S may focus at- 
tention, the principle of least effort op- 
erates (Underwood, 1963), and S tends to 
focus upon that dimension of the stimulus 
which most reliably elicits a response lead- 
ing to reinforcement. 

The tendency for Ss to focus upon the 
stimulus dimension which most reliably 
elicits the correct response, even though 
the stimulus dimension may be irrelevant 
in terms of transfer, suggests that educational
innovations which attempt to facilitate learning
by using incidental cues must
be evaluated in terms of transfer when the 
incidental cues are removed. 
STRATEGY SELECTION AND INFORMATION PROCESSING
IN HUMAN DISCRIMINATION LEARNING¹


L. BERELL KORNREICH 
University of Wisconsin, Milwaukee 


In 2 experiments, adult human Ss received 32 4-dimensional dis- 
crimination problems. A method which purported to tap all the 
hypotheses (Hs) was compared with Levine’s (1966) blank-trial 
procedure, which taps only 1 H. No difference in S’s behavior under 
these 2 procedures was found. Each S’s strategy was inferred from 
the pattern of Hs manifested, and it was found that S’s strategy
contributed a large amount of the variance. Separate analyses for
different strategies were performed. It was pointed out that whether
outcomes are preprogrammed or contingent upon S’s choice response, the
effect of outcome, that is, whether “right” or “wrong,” is obscured by
merely analyzing the problems solved. An analysis was performed which
looked at information-processing errors after each outcome. It was
found that most errors occur on Trial 2 of the individual problem and
that significantly more errors are made after “wrong” than “right”
outcomes on both Trials 1 and 2. However, the difference is not large
considering the number of trials involved.



Levine (1966) introduced a theory of hu- 
man discrimination learning which treated 
S as an information processor and ana- 
lyzer. He proposed that Ss operate under a 
strategy almost identical to Bruner’s (Bru- 
ner, Goodnow, & Austin, 1956) “focusing” 
strategy. It is assumed that Ss attempt to 
remember (encode) all the cues which logi- 
cally could be correct after an outcome 
(e.g., “right” or “wrong”). These cues are
then stored in memory as “hypotheses” 
(Hs) and are tested against future out- 
comes. In this way, Hs are eliminated from 
the retained set until one H remains as the 
“correct answer” to the problem. 

A methodology was employed whereby 
the set of possible Hs was determined by 
the experiment, and one H that S was hold- 
ing after each trial could be inferred. 
This inference permitted unambiguous pre- 
diction of the response on outcome trials 
after the first one. The nature of the out- 
come, whether right or wrong, was con- 
trolled so that the effect of outcome on the 
retention or rejection of the H being held 
could be analyzed. 

¹ This article is based on a dissertation submitted to the Graduate
School, Indiana University, in partial fulfillment of the requirements
for the PhD degree. The author wishes to thank his advisor, Richard
Young, for his help and encouragement throughout the dissertation
project.

Levine’s procedure for inferring one H being held involves presenting
four “blank” (no outcome) trials. There are 16 possible sequences of
S’s four choice responses. The stimuli were constructed so that eight
sequence patterns conform to the eight possible Hs as defined by E and
told to S. Thus, if S keeps responding on the basis of a single H, that
H manifests itself in a distinguishable sequence over the four blank
trials.

Levine found that Ss were very systematic in this respect, and that
they continued to respond on the basis of an H until a wrong outcome
was received. Then they chose another H. An analysis of the new Hs
chosen revealed that Ss had to be holding several Hs at one time, and
that they also eliminated several at one time. This finding gave rise
to the formulation of the “focusing” strategy.

The “blank-trial” procedure allowed the inference of only one H, and
the analysis of the grouped data revealed that Ss held several Hs at
one time. The analysis of grouped data further showed that the
information-processing ability of Ss was far from perfect. This was
especially apparent after wrong outcomes. So the question arises: What
causes the Ss to make errors? It could be that all Ss use the same
strategy, that errors are due to faulty encoding at some point during
the problem, and that Ss differ in their ability
to encode. Levine’s theory makes such an 
assumption. It could also be that Ss differ 
in the strategy used, so that some errors 
are due to inferior strategies, as well as 
faulty implementation of the focusing 
strategy. In any event, a procedure which 
taps all the Hs being considered by S 
would shed light on the question of how er- 
rors occur. It would also be a direct test 
to Levine’s implicit assumption that all Ss 
use the same or a similar strategy. 

It was decided that a choice-response 
procedure would be interpolated between 
outcome trials instead of the four blank 
trials used by Levine. In the present experi- 
ment, Ss were faced with eight buttons and 
asked to indicate which Hs “still could be 
correct” after each of the outcomes. It 
was hoped that such a procedure would tap 
all the Hs being considered by S after each 
outcome without radically changing the 
task used by Levine. In order to check on 
the possibility that the task had been 
altered, groups of Ss were run under Le- 
vine’s blank-trial procedure in the first ex- 
periment. 


Experiment I²


Method 


Subjects. The Ss were 60 undergraduate volun- 

teers at Central Connecticut College. 
Stimulus cards and problems. The discrimination problems consisted of
cards on which there were drawn two stimuli about 14 inches apart. The
stimuli varied in four dimensions: color, size, letter, and position.
Black and white were used as colors, X and T, as letters. A large
letter was 1 inch, a small letter, % inch in height. All the problems
were composed of a series of such cards, with a blank card separating
the problems.

Two groups of 20 Ss received identical problems which incorporated
Levine’s blank-trial procedure. A problem was composed of 16 cards, 4
outcome cards and 12 blank-trial cards. Outcomes, either right or
wrong, were always given after Cards 1, 6, and 11 and were given half
the time after Card 16. Figure 1 shows a diagram of the problem.

These four cards form a set with several properties. Each value of each
dimension is combined exactly twice with the values of all the other
dimensions. The set provides that, after the first outcome, four of the
eight cues remain as logically possible solutions; after the second
outcome, two remain; and, after the third outcome, the solution is
logically determined. This is true whether the outcomes are right or
wrong and allows E to program all outcomes for S. It also follows that
S always has only a 50% chance of choosing the correct stimulus on the
first three outcome cards. Such a set is also used as blank trials and
insures that eight of the 16 possible sequences of choices to the four
cards correspond to the eight cues or Hs.

² The author wishes to acknowledge the assistance of Frederick Karls in
the collection of the data for Experiment I.

The four cards used as outcome cards are called 
Set A. After the first three outcome cards, four 
blank-trial cards were presented which were also 
orthogonal because they were merely the same 
stimuli reversed on the card. These four cards are 
called Set B. Thus, instead of the large-black-X being on the left, it
appeared on the right, etc., for all four cards. The four blank trials
were used in order to infer S’s H after the first three outcomes. The
entire problem is presented in diagram form in Figure 2.
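
The logic of the orthogonal card set can be sketched in a few lines.
The four cards below are a hypothetical set chosen to satisfy the
stated properties; the actual cards of Set A are not recoverable from
the text, and all function names are illustrative. The sketch shows
the set of logically possible Hs shrinking from eight to four, two, and
one over three outcomes, and an H being read off from a four-response
blank-trial pattern.

```python
from itertools import product

OPPOSITE = {"black": "white", "white": "black", "large": "small",
            "small": "large", "X": "T", "T": "X",
            "left": "right", "right": "left"}
ALL_HS = frozenset(OPPOSITE)  # the eight cues, each a possible H


def make_card(color, size, letter):
    """Build a card from its left stimulus; the right one takes opposite values."""
    left = frozenset({color, size, letter, "left"})
    right = frozenset(OPPOSITE[value] for value in left)
    return {"left": left, "right": right}


# Hypothetical orthogonal set (left-stimulus values); not the study's cards.
SET_A = [make_card("black", "large", "X"), make_card("black", "small", "T"),
         make_card("white", "large", "T"), make_card("white", "small", "X")]


def update(hypotheses, card, chosen_side, outcome):
    """Keep only the Hs consistent with the outcome of choosing `chosen_side`."""
    chosen = card[chosen_side]
    return hypotheses & chosen if outcome == "right" else hypotheses - chosen


def blank_trial_pattern(h, cards):
    """Sides chosen on each card when S responds on the basis of a single H."""
    return tuple("left" if h in card["left"] else "right" for card in cards)


def infer_h(pattern, cards):
    """Return the H that produces `pattern`, or None if no single H does."""
    for h in ALL_HS:
        if blank_trial_pattern(h, cards) == pattern:
            return h
    return None


remaining = ALL_HS
for card, side, outcome in zip(SET_A, ("left", "left", "right"),
                               ("right", "wrong", "wrong")):
    remaining = update(remaining, card, side, outcome)
    print(len(remaining), sorted(remaining))

# Of the 16 possible four-response patterns, only 8 correspond to an H.
interpretable = [p for p in product(("left", "right"), repeat=4)
                 if infer_h(p, SET_A) is not None]
print(len(interpretable))
```

The halving holds for every combination of choices and outcomes in this
sketch, which is why the outcomes can be programmed in advance. Set B,
with the stimuli reversed on each card, is not modeled here.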

Design and procedure. The major comparison 
in this experiment is between the blank-trial 
method used by Levine to indicate one H being 
held by S after each outcome, and a method 
which requests S to indicate all Hs being held
after each outcome. Therefore, a group run under 
each method was required. A third group was 
added run under Levine’s method, but with the 
apparatus of the other method present, in order 
to assess the effect of having the eight possible Hs 
on display for S. 

All three groups received the same outcome 
cards, and Ss were assigned at random to one of 
the three groups. The Ss run under Levine’s blank- 
trial procedure were given 16 such problems; Ss 
in Group 2 repeated these 16 problems with different
outcomes and so received 32 problems in all. A deck of
stimulus cards was prepared which made up eight problems.
A random order was used for both outcome (Set A) and
blank-trial (Set B)
cards. The deck was merely turned back to the 
beginning, and the cards were presented again for 
the remaining problems. The Ss were told that 
the same solutions would not necessarily be cor- 
rect, even though the cards were being presented 
a second time. Orders of cards and the written 
order of the Hs on the apparatus were changed 
after five Ss had been run in each group. Thus, 
four orders were used, since each group contained 
20 Ss. 

Another important part of the design relates 
to the orthogonality of the stimulus-card sets.
Levine programmed the outcomes, whether right 
or wrong, for each problem. This was possible 
since either outcome is “logical” until the fourth 
outcome. He found that no S indicated doubt that 
the problems had predetermined solutions. 

Fig. 1. A set of four stimulus cards and the eight sequence patterns
produced when one hypothesis is followed by S over the four trials.

Each sequence of three outcomes was presented twice over the 16
problems. That is, the sequence of right-wrong-wrong was presented
twice, etc., for the other seven sequences. This procedure was followed
for all three groups. For Group 2 each sequence was again presented
twice during the second 16 problems, only in a different order.
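
A schedule of programmed outcomes of this kind can be generated as in
the sketch below. The shuffling rule and the seed are assumptions made
for illustration; the text states only that each of the eight
three-outcome sequences occurred twice in every block of 16 problems.

```python
import itertools
import random


def outcome_schedule(rng, repetitions=2):
    """Return a shuffled list in which each three-outcome sequence
    appears `repetitions` times."""
    sequences = list(itertools.product(("right", "wrong"), repeat=3))  # 8
    schedule = sequences * repetitions  # 16 problems when repetitions == 2
    rng.shuffle(schedule)
    return schedule


# Assumption: a fixed seed, used only to make the example reproducible.
schedule = outcome_schedule(random.Random(0))
for problem_number, sequence in enumerate(schedule, start=1):
    print(problem_number, "-".join(sequence))
```

Calling the function a second time with the same generator gives a
differently ordered block, corresponding to the second 16 problems
received by Group 2.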

The programming of outcomes allowed for
systematic testing of the differential functioning
of right and wrong. Yet, a problem can still be 
scored as solved if the last H used by S is con- 
sistent with the “information” contained in the 
first three outcome trials. Thus, if S chose the black 
stimulus and was told “wrong” on the first three 
outcome trials, the white cue can be regarded as 
the correct solution. If S evidenced a white H 
after the third outcome trial, he can be said to 
have solved the problem. 

Thus the design consisted of three groups of 20 
Ss each. Groups 1 and 1A were nearly identical, 
in that each was run under the blank-trial proce- 
dure. The only difference between them was that 
Group 1A had the wooden apparatus with the 
eight Hs written on it together with additional 
instructions commenting upon the presence of the 
apparatus. 

Each S was instructed and run individually. The instructions used for
the 40 blank-trial Ss were nearly identical to those used by Levine.
The only difference was due to giving Ss only two instead of four
practice problems. The first practice problem consisted of 10 trials in
which the color (black) was the basis for solution. The S received the
deck face up. He responded to the top card, the appropriate outcome was
given, and he then turned the card face down out of the way. This
procedure was followed throughout the experiment. The second practice
problem consisted of 46 trials with an outcome given on the first trial
and at every fifth trial thereafter. The left position was the basis
for solution. In both practice problems, nonsolving Ss were given
corrective instructions, redid the problem, and continued on.

After completing the practice problems all Ss received the same
instructions. They were told that “the problems will all be like the
one you’ve just had, always with one of these simple solutions. That
is, one of the colors, sizes, letters, or positions will be correct.”
They were also told that “a problem has just ended and a new one is
beginning when you turn over a blank card.”

Group 1A received the following additional instructions:

The eight possible answers are on [illegible] there, so that you don’t
have to memorize them. Remember, only one of these eight possible
answers is correct for each problem. It is your job to find out which
one is correct.

Group 2 received the following additional instructions:

Now that that is clear, I’d like to add another feature. You will
notice that this apparatus in front of you has eight buttons. Each
button has one of the possible answers written above it. Remember, only
one of these eight possible answers is correct for each problem; it is
consistently correct for that problem. It is your job to find out which
one of these eight is the correct one for that problem. After I say
“right” or “wrong,” you’ll press one or more buttons to indicate which
of the answers you think still could be correct. Any questions about
that?

Fig. 2. A diagram of the 16-trial problem showing the outcome (right or
wrong) trials and the blank (nothing said) trials from which Ss’
hypotheses (Hs) were inferred.

The apparatus used for Group 2 was a completely mechanical wooden box
(13 inches wide X 8 inches long). It had a panel with eight buttons on
it which faced S. Above each button was written one of the eight Hs in
the following manner: X, T, large, small, white, black, left, right.
The two levels of one dimension were always kept together. When a
button was depressed by S it tipped over a small wood block on a hinge
inside the box. Since the back of the box was open, the tipped blocks
were visible to E but not to S, and since the button returned to its
former position when released, no visible record of the buttons pressed
remained for S. The E kept a record of the buttons pressed. The blocks
were then reset after each trial to the upright position.


Results and Discussion 

The first question to be considered is
whether the introduction of a visual repre-
sentation of the eight Hs and the inter-
polation of a button-pushing task radi-
cally changed the task used for Group 1.
In order to answer this question, the num- 
ber of correct-choice responses to the last 
stimulus cards of the problems was com- 
pared among the three groups. Out of 320 
problems, Groups 1, 1A, and 2 made 68%, 
72%, and 75% correct-choice responses, 
respectively. An analysis of variance for 
these differences revealed no significance, 
F = 1.19, df = 2/48. Neither the effect of
the order of Hs on display nor the interac- 
tion of order and groups was statistically 
significant. Since there is no significant dif- 
ference between groups, it can be assumed 
that the problem-solving task is the same 
for all groups, and that a visual record of 
the eight Hs does not improve problem solv- 
ing after instructions define the eight Hs for 
S. For the rest of the analyses, the data 
from Groups 1 and 1A (blank-trial groups) 
will be combined. 

In this experiment, the stimuli were all 
black and white, and X and T. Levine 
used different colors and letters. A compar- 
ison of data from the blank-trial groups in 
this experiment with Levine’s data reveals 
that the change in stimuli and Ss had not 
greatly changed S behavior. Levine found 
that 92.4% of the blank-trial patterns cor- 
responded to an H, whereas the corre- 
sponding figure for this experiment is 89.9%. 
Levine found that when a right was given, 
the same H was retained 95% of the time 
(based on interpretable patterns). When a 
wrong was given, a different H was selected 
98% of the time. The corresponding fig- 
ures for Groups 1 and 1A are 91% and 96%, 
respectively. So, although the present data
are slightly less systematic than Levine's, 
his type of “H” analysis is applicable. 

The Ss in Group 2 produced two sets of 
data; one set indicated their choice re- 
sponses on the four stimulus cards, and the 
other set indicated their choice of Hs after 
each outcome had been given. The purpose 
of having Ss indicate all the Hs they were 
holding after each outcome was to test 
Levine’s (1966) theory. He proposed that 
an attempt is made to “focus” from trial 
to trial on the basis of the outcomes (in- 
formation) given. For these problems, Ss 
who operate under a focusing strategy
should hold four Hs after the first outcome,
two Hs after the second outcome, and one 
H after the third outcome. Other strategies 
may be manifested by a different pattern 
of responding. 
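
The 4-2-1 narrowing expected under a focusing strategy can be illustrated with a short sketch. It assumes, as the stimulus description suggests, that the two stimuli on a card differ on all four dimensions, so that the inferred-correct stimulus carries exactly four Hs. The function names and the three example cards are hypothetical and are not taken from the original procedure.

```python
# Hypothetical illustration of a focusing strategy; names and example cards
# are assumptions made for this sketch, not part of the original study.
DIMENSIONS = {
    "letter": ("X", "T"),
    "size": ("large", "small"),
    "color": ("white", "black"),
    "position": ("left", "right"),
}


def complement(stimulus):
    """Return the other stimulus on the card (opposite level on every dimension)."""
    other = {}
    for dim, (a, b) in DIMENSIONS.items():
        other[dim] = b if stimulus[dim] == a else a
    return other


def focus(held, chosen, outcome):
    """Keep only the held Hs that appear on the inferred-correct stimulus."""
    correct = chosen if outcome == "right" else complement(chosen)
    return held & frozenset(correct.values())


def run_problem(trials):
    """Apply focusing over (chosen stimulus, outcome) pairs; return the held sets."""
    held = frozenset(v for levels in DIMENSIONS.values() for v in levels)
    history = []
    for chosen, outcome in trials:
        held = focus(held, chosen, outcome)
        history.append(held)
    return history


# Three outcome trials: right, right, wrong (cards chosen so each halves the set).
trials = [
    ({"letter": "X", "size": "large", "color": "white", "position": "left"}, "right"),
    ({"letter": "X", "size": "large", "color": "black", "position": "right"}, "right"),
    ({"letter": "X", "size": "small", "color": "white", "position": "right"}, "wrong"),
]
history = run_problem(trials)
sizes = [len(h) for h in history]
print(sizes, sorted(history[-1]))
```

With these example cards the held set shrinks from four Hs to two to one. A nonfocusing S would show some other sequence of set sizes.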

An analysis of each S’s strategy was 
performed for the 20 Group 2 Ss. A focus- 
ing strategy was operationally defined as 
a 4-2-1 response pattern for three of four
successive problems, which is the pattern 
such a strategy would be expected to 
produce. By this criterion, 14 of the 20 Ss 
became focusers by the fifth problem; nine 
Ss began as focusers. All 14 Ss maintained
that strategy for the rest of the 32 prob-
lems. These 14 Ss will be treated as a
group, and all 32 problems per S will be 
included in the remaining analyses. 
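
The operational criterion can be sketched as a sliding-window check. The sketch assumes each problem is summarized by the number of buttons pressed after the first, second, and third outcomes, and it reports the first problem of the first qualifying window; the text does not say which problem in the window marks the onset of focusing, so that choice is an assumption, as is the function name.

```python
# Hypothetical scoring helper; the onset convention (first problem of the
# first qualifying window) is an assumption, not stated in the text.
FOCUSING_PATTERN = (4, 2, 1)


def first_focusing_window(patterns, window=4, required=3):
    """Return the 1-based index of the first problem in the first run of
    `window` successive problems containing at least `required` 4-2-1
    patterns, or None if no such run exists."""
    for start in range(len(patterns) - window + 1):
        run = patterns[start:start + window]
        hits = sum(tuple(p) == FOCUSING_PATTERN for p in run)
        if hits >= required:
            return start + 1
    return None


# Example: button counts after each of the three outcomes, one tuple per problem.
late_focuser = [(4, 4, 4), (4, 2, 1), (4, 3, 1), (4, 2, 1), (4, 2, 1)]
describer = [(4, 4, 4)] * 6
print(first_focusing_window(late_focuser), first_focusing_window(describer))
```

An S who presses four buttons after every outcome never satisfies the criterion, whatever the number of problems.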

Of the remaining six Ss, three maintained 
nonfocusing strategies for the entire 32 
problems. The other three adopted a 
focusing strategy at the twelfth, four- 
teenth, and twenty-third problem, respec- 
tively, and maintained a focusing strategy 
for the rest of the 32 problems. Thus, a 
“focusing strategy” may be conceived as 
an internal state that S may or may not 
enter. Entrance into this state seems to be 
an end state for S. A group of nonfocusers 
was formed from these six Ss by including 
the first 16 problems of the three Ss who 
eventually became focusers. The first 16 
problems were included rather than just 
those problems before which they became 
focusers in order to preserve an equal 
number of types of problems. For example, 
there were two right-right-right and two 
wrong-wrong-wrong problems among the
16 problems. 

Four of the six nonfocusers indicated 
four Hs after each trial. These four Hs 
represented the four cues of the inferred- 
correct stimulus on the just previous out- 
come card. That is, if they had chosen the 
small-black-X on the left and had been 
told “wrong,” they would press four but- 
tons indicating the large-white-T and right 
Hs, regardless of the outcome trial. An-
other S responded similarly except that he
usually indicated only three of the four 
inferred-correct Hs. These Ss could be 
called “describers,” in that they merely
described the possible correct cues on each
stimulus card. Thus, they failed to elimi-
nate Hs from card to card. The remaining
S only eliminated one or two Hs on each
card instead of the four on each card, as did
the “describers.”

Levine’s assumption, then, that Ss are
focusers is true for the majority of Ss.
However, 3 of 20 Ss never manifested such
a strategy. In fact, 11 of 20 Ss did not be-
come focusers until after they had done at
least a few problems. Do these non-
focusing Ss contribute choice-response data
which are different from those of fo-
cusers? In order to answer that question,
the choice responses for the last stimulus
card were analyzed separately for focusers
and nonfocusers. By the last card, [...]
solution in terms of a single cue [...] fo-
cusing should have narrowed down the
set to the correct H. If the focusing
strategy reflects accurately the [...]
process of S, nonfocusers should not have
narrowed down the correct H and so should
perform at an inferior level compared to
focusers, in terms of making the correct
choice response on the last card.

In order to investigate whether fo-
cusers corresponded to “good” problem
solvers from Groups 1 and 1A (blank-trial
groups) and whether nonfocusers corre-
sponded to “bad” problem solvers, good and
bad groups of Ss were formed from Groups
1 and 1A Ss. “Good” Ss were defined as
those Ss who manifested the correct H on
at least 11 of the 16 problems. “Bad” Ss
were those who manifested only from [...]
to seven correct Hs of the 16 problems.
There were 13 good Ss and 16 bad Ss in
Groups 1 and 1A.

Table 1 presents the data for these
newly created groups. The upper two rows
compare the focusers and good Ss. The
data give strong support to the notion that
good problem solvers under either a blank-
trial or a button-pushing procedure are
good precisely because they are employing a
focusing strategy. The lower two rows com-
pare the bad Ss and the nonfocusers. This
comparison strongly supports the no-
tion that some Ss are poor problem solvers,
whether under a blank-trial or a button-
pushing procedure, precisely because they
do not employ a focusing strategy.

In all groups, an effect related to wrongs 
is apparent. The more wrongs, the more in- 
correct-choice responses, but the effect of 
even one wrong is to reduce the nonfocusers 
and bad Ss to nearly a chance level. The 
effect for the good problem solvers is much 
more moderate. However, all groups do 
about equally well when the outcomes are 
all rights. It should be remembered, though, 
that these results are in terms of choice 
responses, whereas problem solution is 
really defined in terms of the correct H. 
The good and bad groups were formed on 
the basis of the number of correct Hs. An 
analysis of the groups in terms of the num- 
ber of correct H solutions, given zero, one, 
two, and three wrongs, should shed more 
light on the question of the effect of 
wrongs. 

Table 2 presents these data. The data
for the good and bad Ss are based only on
the interpretable H patterns. The data for
the focusers are based on those problems
on which only one button was pushed after
the third outcome. Since the nonfocusers
always pressed more than one button, it
was impossible to score their problems for
the number of correct solutions in terms of
the correct H. Again, the correspondence
between the good Ss and focusers is strik-
ing. The effect of wrongs for them is again
to reduce the number of problem solutions
moderately. The effect of wrongs for bad
Ss is drastic.


TABLE 1
PERCENTAGES OF CORRECT-CHOICE RESPONSE FOR
THE LAST TRIAL OF THE PROBLEM FOR
FOUR SPECIALLY SELECTED GROUPS OF
SUBJECTS GIVEN ZERO, ONE, TWO,
OR THREE WRONGS IN THE PROBLEM

                                          Wrongs
Group                                0      1      2      3

Good Ss (from Groups 1 and 1A)ᵃ     96     88     79    7[?]
    n                               26     78     78     26
Focusers (from Group 2)ᵇ            93     86     79     68
    n                               56    168    168     56
Bad Ss (from Groups 1 and 1A)ᶜ      91     58     51     83
    n                               32     96     96     32
Nonfocusers (from Group 2)ᵈ         94     59     59     61
    n                               18     54     54     18

Note.—The lower numbers indicate the n for
that percentage, and n refers to the number of
problems. Thus, the 26 means that there was a
total of 26 right-right-right problems which were
scored for the good Ss. The response to the last
card was correct for 96% of those 26 problems.

ᵃ 13 Ss.
ᵇ 14 Ss.
ᶜ 16 Ss.
ᵈ 6 Ss.


TABLE 2
PERCENTAGES OF PROBLEM SOLUTION IN TERMS OF
CORRECT HYPOTHESES FOR THREE SPECIALLY
SELECTED GROUPS OF SUBJECTS GIVEN
ZERO, ONE, TWO, OR THREE WRONGS
IN THE PROBLEM

                                          Wrongs
Group                                0      1      2      3

Good Ss (from Groups 1 and 1A)ᵃ    100     86     75     72
    n                               25     76     76     25
Focusers (from Group 2)ᵇ            91     87     78     65
    n                               54    154    148     52
Bad Ss (from Groups 1 and 1A)ᶜ      84     52     31     17
    n                               31     79     81     29

Note.—The lower numbers indicate the n for
that percentage, and n refers to the number of
problems. It was not possible to determine the
percentages for the 6 nonfocuser Ss from Group 2.

ᵃ 13 Ss.
ᵇ 14 Ss.
ᶜ 16 Ss.

Levine grouped all Ss before he analyzed 
the data for the effect of right and wrong 
outcomes. From the data presented here, it 
can be seen how that procedure would 
produce a misleading estimate of the ef- 
fect of wrongs. He found a striking effect 
for wrongs which may have been con- 
tributed mainly by nonfocusing Ss. Instead, 
Levine assumed that a focusing strategy 
produced the effect of wrongs, and he pro- 
posed an explanation of the effect in terms 
of an assumed coding procedure used by 
focusers. It was assumed that S codes the 
cues of the stimulus he chooses. Therefore, 
when faced with a wrong, S has to “erase” 
the just previously coded information and 
recode the new information. For instance, 
consider that the Hs large-black-T-right
are still possibly correct after the first trial.
The S chooses a large-right stimulus on the 
second trial and is told wrong. According 
to Levine, S must erase “large” and “right” 
and find “black” and “T.” The advantage 
to S of receiving rights is that there is no 
necessity to recode. 

Such an explanation, of course, does not 
speak to the data of the nonfocusing Ss. 
Does such an explanation even apply to the 
data from focusers? Before an answer to 
that question can be given, an important 
artifact of the procedure must be discussed. 
In order to study the effect of right and 
wrong outcomes, Levine predetermined the 
schedule of outcomes for each problem. A 
problem still had a solution, but it was 
defined as a function of S’s choice re- 
sponses and the outcomes given. For ex- 
ample, if S happened to choose the left 
side on all three outcome trials of a prob- 
lem designated as a wrong-wrong-wrong 
problem, the solution became defined as 
“right” for that S on that problem. If S 
manifested a “right” H after the third out- 
come, he was credited with solution of the 
problem. Thus, S always defined the solu- 
tion for himself. This procedure was used 
in the present experiment. 

Although such a procedure looks like a 
clever way to control the outcome variable 
so as to better study its effect, just the op- 
posite results. Consider an S who remem- 
bers only one of the four correct Hs after 
the first trial. If two rights occur on the 
second and third outcome trials, he will 
“solve” the problem merely by responding 
consistently on those two trials. The same 
S, if confronted with a wrong on the second 
trial, will have to abandon his one stored 
outcome. He can only infer that the cor- 
rect H is one of the four represented by the 
other stimulus, the position of Ss after the 
first trial. Thus, an S who has “learned” 
exactly the same thing as another, will not 
be credited with solving the problem be- 
cause wrongs happened to occur in that 
problem. Similar contingencies exist for all 
the combinations of Hs held and outcomes.

The high percentage of right-right-right 
problems solved by bad Ss can now be ex-
plained. These Ss merely had to respond
consistently on the basis of one correct H
after the first trial. The rights would serve
to define that H as the correct solution.
The effect of right and wrong outcomes 
cannot be analyzed in terms of problems 
solved due to the artifact caused by pre- 
determining the outcomes. Before an appro- 
priate analysis is presented, Experiment II 
will be introduced. Then the analysis will 
use the combined data of Experiments I 
and II. 
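
The artifact produced by preprogrammed outcomes can be made concrete with a small sketch. It assumes that the two stimuli on a card differ on every dimension and that the "solution" is whatever H remains consistent with S's own choices and the scheduled outcomes. The example S remembers only "large" after the first trial and always chooses the stimulus showing it. The function names and example cards are hypothetical.

```python
# Hypothetical illustration of the preprogrammed-outcome artifact; the cards
# and helper names are assumptions made for this sketch.
DIMENSIONS = {
    "letter": ("X", "T"),
    "size": ("large", "small"),
    "color": ("white", "black"),
    "position": ("left", "right"),
}


def complement(stimulus):
    """Return the other stimulus on the card (opposite level on every dimension)."""
    return {
        dim: (b if stimulus[dim] == a else a)
        for dim, (a, b) in DIMENSIONS.items()
    }


def viable_hs(choices, outcomes):
    """Hs still consistent with every (choice, scheduled outcome) pair so far."""
    viable = None
    for chosen, outcome in zip(choices, outcomes):
        correct = chosen if outcome == "right" else complement(chosen)
        cues = set(correct.values())
        viable = cues if viable is None else viable & cues
    return viable


# S holds only "large" and picks the large stimulus on every outcome trial.
choices = [
    {"letter": "X", "size": "large", "color": "white", "position": "left"},
    {"letter": "X", "size": "large", "color": "black", "position": "right"},
    {"letter": "T", "size": "large", "color": "black", "position": "left"},
]

# Right-right-right schedule: the single remembered H becomes the solution.
all_rights = viable_hs(choices, ["right", "right", "right"])

# A wrong on the second outcome trial removes the remembered H from the viable set.
wrong_on_two = viable_hs(choices[:2], ["right", "wrong"])
print(all_rights, wrong_on_two)
```

Under the all-rights schedule the remembered H is credited as the solution, whereas the same memory state is left with no viable H once a wrong is scheduled on the second outcome trial.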


Experiment II 


Experiment II Ss were run under condi- 
tions similar to Group 2 Ss in Experiment 
I. Again Ss received 32 problems and were 
asked to indicate between choice trials 
which “answers still could be correct.” 
However, a major change involved present- 
ing “real” rather than “preprogrammed” 
outcomes. That is, a solution was selected 
for a problem (e.g., “black”), and outcomes
were presented depending upon S's choice 
response, rather than the program set by 
E as in Experiment I. 


Method 


Subjects. Thirty-four student-nurse volunteers 
who were taking their psychiatric training at 
Connecticut Valley Hospital, Middletown, Con-
necticut, served as Ss.

Stimulus cards and problems. The problems 
were the same as those used in Experiment I
except that real outcomes were presented. The 
same order of cues appeared on the wooden appa- 
ratus for all Ss. 

Procedure. The instructions were shortened by
eliminating the two practice problems. A paragraph
was added at the end which restricted the number
of buttons pressed to only one after the third
and fourth cards. This change in procedure was
made so that the data for nonfocusing Ss could be
analyzed to determine what H they were follow-
ing in their choice responses. Richter (1965) had
reported choice-response data for bad Ss that
were below the .5 guessing probability for prob-
lems with two and three wrong outcomes. It was
hoped that knowledge of the H such bad Ss were
using to make choice responses would help
explain their below-chance responding.


Results and Discussion 


A group of bad Ss was selected in terms
of their choice responses on the fourth
trial. Twelve Ss who made 13 or more er-
rors constituted the group. The number of
errors as a function of wrongs in the prob-
lem was computed. Below-chance respond-
ing was not observed for any problem type,
so that below-chance responding in Rich-
ter’s group may have been due to his selec-
tion procedure.

Only 9 of the 34 Ss became focusers by
the criterion of a 4-2-1 pattern of button
responses for three of four problems in
succession. All but one of these Ss became
a focuser before the ninth problem. Of the
remaining Ss, 11 were describers, and 7
were partial focusers, in that they either
failed to hold all four Hs after the first
trial or they failed to press four buttons.
The other 7 Ss, as indicated by a post-
experimental interview and inspection of
their data, were either seeking a sequence
solution or only eliminated one H when a
wrong was given.

Again it is clear that all Ss are not
focusers. The number of focusers in a
group may be dependent on the extensive-
ness of the instructions and the number of
practice problems or the population from
which Ss are drawn.

Can it then be said that nonfocusers “do
not understand” the instructions? Such a
question forces a consideration of what it
would mean to understand. The avowed
purpose of the instructions for Levine and
Richter was to restrict the H set to the
eight, and to establish that a solution con-
sisted of a single H. If “understanding”
means to have learned this, then the only
Ss who can be said not to have under-
stood are the four Ss who looked for a solu-
tion in terms of sequences. These Ss held
other than the eight Hs.

In a sense, however, any strategy other
than a focusing strategy for these simple
four-trial problems indicates a lack of un-
derstanding. Only focusing allows for a
“solution” by the fourth trial in the sense
that correct Hs arrived at by other strat-
egies involve guessing or luck. The major
“learning” which takes place, then, is the
formation of a strategy. The Ss can be
viewed as being in one of two states; they
either have learned or they have not
learned with respect to the task set them.


General Discussion


The remaining analyses will consider 
only focusers, those Ss who really solved the 
discrimination problems. The effect on in- 
formation processing of right versus wrong 
outcomes can only be considered with re- 
spect to their data, since, for nonfocusers, 
successive trials do not carry information 
in a measurable sense. Table 3 shows a 
comparison between focusers from Experi- 
ments I and II. For the Experiment II 
group, only the problems after S became 
a focuser are included. The correspondence 
between the two groups is striking. Using
real problems appears to have little effect. 
Again, there is a steady decrease in solu- 
tions as more wrongs serve as outcomes. 
However, before these data are taken as 
evidence for greater difficulty in informa- 
tion processing after wrongs, an artifact 
must again be considered. 

In Experiment II, the correct cue was
determined by E, and the outcomes
were allowed to vary as a function of the
solution and S’s choice. Consider an S who
forgets the correct cue after the first trial.
The probability of receiving a right on the
second trial is less for him than for an S
who also forgot one H, an incorrect one.
This is because the correct H is no longer
in his set of Hs, whereas, for the other S,
his set of Hs has been reduced by one in-
correct H. On the third trial, one S will
receive a wrong, since he is only holding
one H, an incorrect one, whereas the other
S will at least have a 50% chance of re-
ceiving a right if he still holds two Hs, the
correct one and an incorrect one. In the
extreme case, an S who happens to remem-
ber only the correct cue by chance will re-
ceive all rights if he merely chooses on the
basis of this H. In this way, an informa-
tion-processing error will lead to wrongs
and an incorrect solution if the error in-
volves the cue designated correct for that
problem by E. An information-processing
error which is just as significant from a psy-
chological point of view will increase the
probability of rights and correct solution
if the error does not involve the correct
cue. For this reason, the data presented in
Table 3 are not an appropriate measure of
the effect of rights versus wrongs. Thus,
whether one preprograms outcomes or al-
lows them to vary, the effect of rights
versus wrongs is obscured.


TABLE 3
PERCENTAGES OF PROBLEM SOLUTION IN TERMS OF
THE CORRECT HYPOTHESIS FOR FOCUSERS
FROM EXPERIMENTS I AND II GIVEN
ZERO, ONE, TWO, OR THREE
WRONGS IN THE PROBLEM

                           Wrongs
Focusers              0      1      2      3

From Exp. Iᵃ         91     87     78     65
    n                54    154    148     52
From Exp. IIᵇ        97     87     74     64
    n                30     89     89     25

Note.—The lower numbers indicate the n for
that percentage, and n refers to the number of
problems. Thus, the 54 means that there was a
total of 54 right-right-right problems which were
scored for the focusers from Experiment I. The
correct hypothesis was manifested after the third
trial for 91% of these problems.

ᵃ 14 Ss.
ᵇ 9 Ss.


In both Experiments I and II, an ap-
propriate measure involves observing the
entire set of Hs manifested after right and
after wrong outcomes. After Trial 1, four
Hs may be considered correct, in that
complete information processing would en-
code all four Hs embodied by the positive
stimulus. After Trial 2, two Hs would be
considered correct, and, after the third
trial, one H remains as the solution. In
order to analyze errors after the first trial 
then, the Hs manifested by S are compared 
with the four correct Hs. An error may in- 
volve getting three out of four, etc. This 
type of error is called an error of omission. 
Another type of error would consist of S's 
pressing five or more buttons. This is called 
an error of inclusion. 
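
The error scoring just described can be sketched as a set comparison. The sketch assumes that an omission takes precedence when correct Hs are missing (so a substituted incorrect H still counts as a -1 error), and that on later trials an inclusion error means manifesting more Hs than are then correct; the text states the inclusion rule only for Trial 1. The function name is hypothetical.

```python
# Hypothetical scoring helper; the precedence of omission over inclusion and
# the generalization of "inclusion" beyond Trial 1 are assumptions.
def classify_error(correct, manifested):
    """Compare the Hs an S manifested with the Hs correct at that trial.

    Returns ("omission", -k) when k correct Hs are missing,
    ("inclusion", extra) when all correct Hs are present plus extras,
    and ("none", 0) otherwise.
    """
    correct, manifested = set(correct), set(manifested)
    missing = len(correct - manifested)
    if missing:
        return ("omission", -missing)
    if len(manifested) > len(correct):
        return ("inclusion", len(manifested) - len(correct))
    return ("none", 0)


trial1_correct = {"X", "large", "white", "left"}
print(classify_error(trial1_correct, {"X", "large", "white"}))           # one H forgotten
print(classify_error(trial1_correct, {"X", "large", "white", "right"}))  # one H substituted
print(classify_error(trial1_correct, trial1_correct | {"black"}))        # five buttons pressed
```

Forgetting one H and substituting an incorrect H for it are both scored as a -1 error in this sketch, in line with the way the two cases are grouped in the analysis that follows.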

The analysis for Trials 2 and 3 is the 
same except that fewer Hs are correct. The 
results of this analysis are presented in 
Table 4. It should be noted that errors
which occur on Trial 1 and are merely
“carried” to Trial 2 do not count as errors 
on Trial 2. The same procedure also ap- 
plies for Trial 3. 

It can be seen that most errors involve 
forgetting one H of the correct set and 
either substituting an incorrect H for it or 
merely manifesting one less than the com- 
plete set. In considering only this type of 
error from both experiments, a Z test in- 
dicates a significant difference in errors af- 
ter right and wrong outcomes. For Trial 1, 
Z = 2.03, p < .05, and for Trial 2, Z = 
2.78, p < .01, which shows a greater prob- 
ability of a —1 type error after wrong than 
after right for Trials 1 and 2. 
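
The text does not say which form of Z test was used. One common choice for comparing error rates after right and after wrong outcomes is the pooled two-proportion Z test, sketched below. The counts in the example are hypothetical and are not the study's data; only the two reported Z values (2.03 and 2.78) come from the text.

```python
import math


def two_sided_p(z):
    """Two-sided normal-curve probability for a given Z."""
    return math.erfc(abs(z) / math.sqrt(2))


def two_proportion_z(errors_a, n_a, errors_b, n_b):
    """Pooled two-proportion Z test (an assumed form, not confirmed by the text)."""
    p_a, p_b = errors_a / n_a, errors_b / n_b
    pooled = (errors_a + errors_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return z, two_sided_p(z)


# Hypothetical counts: 30 errors in 200 opportunities after wrongs,
# 15 errors in 200 opportunities after rights.
z, p = two_proportion_z(30, 200, 15, 200)
print(round(z, 2), round(p, 3))

# The reported Z values fall beyond the stated significance levels.
print(two_sided_p(2.03) < .05, two_sided_p(2.78) < .01)
```

The last line only checks that the reported Z values are consistent with the stated probability levels under a two-sided normal test.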

Not every information-processing error 
results in an incorrect solution. In order to 
account for incorrect solutions, that is, an 
incorrect H after the third trial, a differ- 
ent analysis is needed. Only those problems 
on which incorrect solutions occurred are 
selected. Then each problem is evaluated 
to determine at what trial and whether af- 
ter a right or wrong the correct cue was 
dropped from the held set. Table 5 pre- 
sents these data for both Experiments I 
and II. The proportion of unsolved prob- 
lems is .18 for Experiment I and .16 for 
Experiment II. A focusing strategy is as-
sociated with more than 80% solution of
these simple problems. Both sets of data 
show the same pattern of errors. On Trial 
2, information-processing errors account 
for over half of the incorrect solutions. Also 
on this trial wrong outcomes appear to 
cause more incorrect solutions than right 
outcomes. A Z test on the combined data of 
Experiments I and II for Trial 2 indicates 
a difference between right and wrong out- 
comes significant beyond the .01 level. 
However, this difference is a small one 
considering the total number of problems. 

Levine assumed that a coding process of
the sort proposed recently by several au-
thors (Glanzer & Clark, 1964; Haber,
1964; Sperling, 1963) takes place. Levine
further assumed that Ss code, that is, at-
tempt to remember, the cues represented
by the stimulus they choose before an out-
come is given. In this way, he attempted to
account for what he took to be greater diffi-
culty in coding after wrong outcomes. If
S codes the stimulus he chooses on the first
trial before an outcome is given, the en-
coded set remains the correct one if a right
outcome is given, whereas the encoded set
must be “erased” and its complement
coded if a wrong outcome is presented.

The reported data show that a small but
statistically significant difference in coding
errors exists as a function of outcome on
the first trial. The difference is explained
by Levine’s coding hypothesis. The reason
for the small difference involves the pro-
cedure used by Levine and in the reported
experiments. Both the positive and nega-
tive instances were presented simul-
taneously. After an outcome, S controlled
the exposure time since each S turned the
card over at his own rate. Therefore, after
an outcome was presented, S could observe
the inferred-correct stimulus for as long as
he wished. It is proposed that Ss quickly
learn to do that. In other words, Ss learn
to encode the H set after and not before the
outcome. Before they learn, wrongs do pose
more of a problem than rights. After learn-
ing, when told “wrong” after pointing to a
stimulus on the first trial, S merely ob-
serves the other stimulus on the card and
codes those four Hs. Interviews with Ss
after the experiment support such an in-
terpretation. Apparently Ss can code the
four Hs almost perfectly.


TABLE 4
NUMBER AND TYPES OF INFORMATION-PROCESSING ERRORS AFTER A RIGHT
(+) AND AFTER A WRONG (-)
FOR TRIALS 1, 2, AND 3

[Table body largely illegible in the scanned source. Columns: errors of
omission (-1, -2, -3, -4), errors of inclusion, and proportion of errors.
Legible entries:
Trial 1ᵃ, Exp. II: -1, 2+ 10-; -2, 0+ 1-; -3, 0+ 0-; -4, 1+ 2-;
    errors of inclusion, 0+ 1-; proportion, .07.
Trial 2ᵇ, Exp. I: -1, 26+ 49-.
Trial 3ᶜ, Exp. II: 6+ 6-; proportion, .05.]

Note.—The -1, -2, -3 and -4 refer to an error of missing one, two, three, or all four of the correct
hypotheses.

ᵃ Four correct hypotheses.
ᵇ Two correct hypotheses.
ᶜ One correct hypothesis.
ᵈ Errors of inclusion disallowed by procedure.


TABLE 5
INFORMATION-PROCESSING ERRORS WHICH RE-
SULTED IN INCORRECT SOLUTIONS PRE-
SENTED BY THE TRIAL ON WHICH
THEY OCCURRED AND WHETHER
AFTER A RIGHT (+) OR WRONG
(-) OUTCOME FOR EXPERI-
MENTS I AND II

Trial    Exp. I       Total errors    Exp. II     Total errors
1         1+  5-       6 = .08         1+  2-      3 = .08
2        13+ 31-      44 = .57         5+ 17-     22 = .59
3        11+ 16-      27 = .35         6+  6-     12 = .33

Total                 77 (.18)                    37 (.16)

Note.—In Experiment I nine problems could
not be scored since two hypotheses were mani-
fested on the third trial. The proportions in the
Total row are the proportions of unsolved problems.

A reliable, although again not a large,
difference in information-processing errors 
as a function of outcome was observed af- 
ter the second trial. More errors are made 
after the second trial than either the first 
or third. An error is made after the second 
trial on about 25% of the problems, and it 
is these errors which lead to over half of the 
incorrect solutions. Why should this be 
true? The S has only to code two Hs rather 
than four, as after the first trial. The an- 
swer must lie with a perceptual factor. The 
S must remember the four correct Hs from 
Trial 1 as he chooses on Trial 2. If given a 
right, he must seek the two correct cues 
represented by the chosen stimulus. If 
given a wrong, he can either seek the two 
correct cues of the chosen stimulus and 
subtract them from the four encoded ones, 
or seek the two correct cues represented by 
the other stimulus on the card. The slightly
greater number of errors after wrongs is probably due to
some Ss attempting to make subtractions. 
The greater number of errors after the 
second trial is probably due to forgetting 
one of the four cues once the first card is 
turned over. There is also perceptual inter- 
ference from the incorrect cues which are 
part of each stimulus. Few errors either 
after a right or wrong are made after Trial 
3. Apparently, when only two Hs need be
remembered, the lack of perceptual aids is 
not critical.

From the two studies reported here, it
can be seen that a large source of variance
in the data was accounted for by the strat-
egy which S used. Therefore, the previous
procedure of grouping the data from all Ss
together led to inaccurate estimates of im- 
portant parameters. The failure to recog- 
nize the artifact introduced by prepro- 
gramming outcomes and by the guessing 
probability led to further inaccuracies. For 
these reasons, Levine (1966) and Richter 
(1965) emphasized the assertion that wrong 
outcomes cause more difficulty in informa- 
tion processing than right outcomes, rather 
than individual differences in Ss’ approach 
to the experimental task.


FREE-RECALL LEARNING


JAMES H. CROUSE 
State University of New York at Binghamton* 


2 free-recall learning experiments were performed. In Experiment I 
the presence or absence of retrieval cues was varied factorially 
with the presence or absence of storage cues. Retrieval cues fa- 
cilitated recall when storage cues were present, but not when they 
were absent. Experiment II showed that the facilitating effects of 
retrieval cues depended on the pairing of storage cues and the words 
to be recalled. The results were interpreted in terms of the storage tags 


generated during input. 


A trial in a typical free-recall learning 
(FRL) experiment consists of two phases, 
the items being presented for study during 
the input or storage phase, and recall being 
tested during the output or retrieval phase. 
In a recent study, Tulving and Pearlstone 
(1966) presented nouns belonging to cate- 
gories during the input phase of FRL. 
Category names of the nouns were always 
presented along with the nouns and served 
as storage cues (e.g., weapons—BOMB, CAN- 
NON; crimes—TREASON, THEFT). During the 
output phase the same category names 
were presented as retrieval cues in some
conditions but not in others. Recall of the
nouns was highest when retrieval cues were
presented. This effect of retrieval cues
may be dependent on the storage cues being
presented during input. That is, the effect
of retrieval cues may be greater when
storage cues are presented than when they
are not presented. This hypothesis is tested
in Experiment I.


Experiment I

Method


Every S was given a single trial on the same
list of 35 nouns. Each noun was taken from a
different category in the Cohen, Bousfield, and
Whitmarsh (1957) norms with the mean category
frequency being 29.17. The design was a 2 × 2
factorial in which the first variable was the pres-
ence or absence of storage cues and the second was
the presence or absence of retrieval cues. The
four conditions formed by these variables are
designated as C-C, C-NC, NC-C, and NC-NC.
Twelve undergraduates from the State University
of New York at Binghamton were assigned to
each condition. The Ss were run in small groups
numbering from one to five. The groups were 
randomly assigned to the conditions subject to 
the restriction of achieving an equal number per 
condition. 

The nouns were read at a 3-second rate. When 
storage cues were presented, Ss were told that a 
category name (the category label from the 
Cohen et al., 1957, norms) would be read before 
each noun (e.g., a bird—RAVEN). They were told
that they would not have to recall the category 
names, but that the category names would help 
them recall the nouns. When storage cues were 
absent, only the nouns were read (e.g., RAVEN). 
Immediately following presentation, a 4-minute 
recall period was given. When retrieval cues were 
presented, the 35 category names were listed in 
the same order as they occurred during input 
with a blank space beside each one. Brief instruc- 
tions at the top of the recall sheet told S that 
each noun presented could be described by one 
of the categories and that he should print each 
noun he could recall beside its appropriate cate- 
gory. When retrieval cues were absent, only the 
blank spaces were provided, and brief instructions 
at the top of the recall sheet told S he should 
print the nouns in these spaces. The Ss in both 
retrieval cue conditions were told to recall the 
nouns in any order they desired. They were not 
informed of their condition of recall until the 
time of recall. 


Results 

The number of nouns correctly recalled 
by each S was computed (Table 1). Recall 
was higher when storage cues were pre- 
sented than when they were absent, F 
(1, 44) = 5.54, p < .05, and recall was 
higher when retrieval cues were present 
than when they were absent, F (1, 44) = 
16.49, p < .01. The most important finding, 
however, was the significant interaction, 
F (1, 44) = 10.51, p < .01, which was 
further analyzed by individual F tests. It 
TABLE 1 
Mean Number of Correct Responses 

                      Retrieval cues 
Storage cues     Absent (NC)   Present (C) 
Absent (NC)         13.33         14.33 
Present (C)         12.25         21.17 


was found that Group C-C recalled signifi- 
cantly more nouns than each of the other 
groups, the Fs (1, 44) being 26.67, 15.66, 
and 20.58, for the comparisons with Groups 
C-NC, NC-C, and NC-NC, respectively. 
None of the comparisons among the latter 
three groups approached significance, Fs (1, 
44) < 1.46. 
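
The F ratios reported above can be checked against the cell means in Table 1. The following is a minimal Python sketch, not the authors' analysis. It rests on two assumptions: the error mean square is not reported, so it is backed out of the reported F of 26.67 for the C-C versus C-NC comparison, and the NC-NC mean is taken to be 13.33, the value that the reported Fs imply. All function and variable names are illustrative.

```python
# Cell means keyed by (storage cue)-(retrieval cue) condition.
# NC-NC = 13.33 is an assumption: it is the value implied by the reported Fs.
means = {"C-C": 21.17, "C-NC": 12.25, "NC-C": 14.33, "NC-NC": 13.33}
n = 12  # Ss per condition, as stated in the Method section


def implied_mse(f_value, mean_a, mean_b, n):
    """Error mean square implied by a reported 1-df F comparing two group means."""
    return n * (mean_a - mean_b) ** 2 / (2 * f_value)


def contrast_f(weights, means, n, mse):
    """F (1 df) for a linear contrast among equal-n group means."""
    estimate = sum(w * means[group] for group, w in weights.items())
    sum_sq_weights = sum(w ** 2 for w in weights.values())
    return n * estimate ** 2 / (sum_sq_weights * mse)


# Assumption: the error term is recovered from the reported F for C-C vs. C-NC.
mse = implied_mse(26.67, means["C-C"], means["C-NC"], n)

# Contrast weights for the two main effects and the interaction of the 2 X 2 design.
storage = {"C-C": 1, "C-NC": 1, "NC-C": -1, "NC-NC": -1}
retrieval = {"C-C": 1, "C-NC": -1, "NC-C": 1, "NC-NC": -1}
interaction = {"C-C": 1, "C-NC": -1, "NC-C": -1, "NC-NC": 1}

f_storage = contrast_f(storage, means, n, mse)
f_retrieval = contrast_f(retrieval, means, n, mse)
f_interaction = contrast_f(interaction, means, n, mse)

print(f"implied error mean square: {mse:.2f}")
print(f"storage F(1, 44):     {f_storage:.2f}")
print(f"retrieval F(1, 44):   {f_retrieval:.2f}")
print(f"interaction F(1, 44): {f_interaction:.2f}")
```

Under these assumptions the storage, retrieval, and interaction Fs come out at about 5.56, 16.49, and 10.51, in line with the reported values within rounding.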

Discussion 

This significant interaction shows that 
retrieval cues produce greater recall than 
no retrieval cues when storage cues are 
presented, thus replicating the findings of 
Tulving and Pearlstone (1966), but have 
no effect on recall when storage cues are 
absent. Essentially this same finding also 
has been obtained by Wood (1967) in work 
reported after the completion of the present 
research. In Experiment I Wood reported a 
significant interaction in which retrieval 
cues had a larger effect when storage cues 
were presented, but, unlike the present ex- 
periment, the effect of retrieval cues was 
not eliminated completely when storage 
cues were absent. The effect was elimi- 
nated, however, in Experiment II when 
category frequency was comparable to the 
present study. 

It seems possible, as was the case in 
Experiment I above, that the facilitation 
which occurs from retrieval cues when 
storage cues are presented (Tulving & 
Pearlstone, 1966; Wood, 1967) may be due 
to the appropriate pairing of the cues 
with the nouns during storage, but it may 
also be due to some other aspect of present- 
ing the category names during storage. 
Experiment II tests this possibility 
by comparing three conditions: C-C, C1-C, 
and C-NC. In Condition C1-C category 
names are presented as storage and re- 
trieval cues, but the cues are inappro- 
priately paired with the nouns during 
storage. Recall should be greater in Condi- 
tion C-C than in C-NC, again demonstrat- 
ing the effectiveness of retrieval cues when 
storage cues are present and appropriately 
paired. If the appropriate pairing of 
storage cues and nouns is important for 
this effect of retrieval cues, then recall in 
Condition C-C should be greater than in 
C1-C; however, if the effect of retrieval 
cues results from the presentation of the 
storage cues per se, then Condition C-C 
should not differ from C1-C. 


Experiment II 


Method 


Sixteen undergraduate female students from 
the State University of New York at Bingham- 
ton were assigned to each of the three conditions: 
C-C, C1-C, and C-NC. The method was the 
same as Experiment I except for the following: 
(a) In the C1-C condition the category names of 
the nouns (the category labels from the Cohen 
et al., 1957, norms) were randomly paired with the 
nouns as storage cues and subsequently presented 
at recall as retrieval cues; and (b) Ss were told 
that the names to be presented might or might 
not help them recall the nouns. The same thing 
was added to the brief instructions at the top 
of the recall sheet for Ss having retrieval cues. 


Results 


The mean number of nouns recalled was 
23.94, 12.50, and 12.38 for Conditions C-C, 
C1-C, and C-NC, respectively. An analysis 
of variance on these groups was signifi- 
cant, F (2, 45) = 37.95, p < .01. In- 
dividual F comparisons indicated that 
recall was higher in Condition C-C than 
Condition C-NC, F (1, 45) = 57.53, p < 
.01, thus showing the facilitation from 
retrieval cues when storage cues are present 
and appropriately paired. Recall was also 
higher in Condition C-C than Condition 
C1-C, F (1, 45) = 56.30, p < .01, suggesting 
that the facilitation produced by retrieval 
cues in Condition C-C is associated with 
the appropriate pairing of the storage cues 
and nouns during presentation and not 
simply the presentation of the cues per se. 
Recall did not differ significantly in Condi- 
tions C1-C and C-NC, F < 1.0. 
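
The Experiment II statistics can be checked in the same way from the three condition means. This is a hedged sketch rather than the authors' computation: the error mean square is again not reported, so it is assumed to equal the value implied by the reported F of 57.53 for C-C versus C-NC, with 16 Ss per condition. The label C1-C stands for the inappropriately paired condition, and the names used are illustrative.

```python
# Condition means reported for Experiment II; C1-C is the mispaired-cue condition.
means = {"C-C": 23.94, "C1-C": 12.50, "C-NC": 12.38}
n = 16  # Ss per condition, as stated in the Method section


def pairwise_f(mean_a, mean_b, n, mse):
    """F (1 df) for the difference between two equal-n group means."""
    return n * (mean_a - mean_b) ** 2 / (2 * mse)


# Assumption: recover the error mean square from the reported F = 57.53.
mse = n * (means["C-C"] - means["C-NC"]) ** 2 / (2 * 57.53)

# Overall one-way F: between-groups mean square over the error mean square.
grand_mean = sum(means.values()) / len(means)
ss_between = n * sum((m - grand_mean) ** 2 for m in means.values())
ms_between = ss_between / (len(means) - 1)
f_overall = ms_between / mse

f_cc_vs_c1c = pairwise_f(means["C-C"], means["C1-C"], n, mse)
f_c1c_vs_cnc = pairwise_f(means["C1-C"], means["C-NC"], n, mse)

print(f"implied error mean square: {mse:.2f}")
print(f"overall F(2, 45):         {f_overall:.2f}")
print(f"C-C vs. C1-C F(1, 45):    {f_cc_vs_c1c:.2f}")
print(f"C1-C vs. C-NC F(1, 45):   {f_c1c_vs_cnc:.3f}")
```

Under these assumptions the overall F is about 37.96 and the C-C versus C1-C comparison about 56.34, close to the reported 37.95 and 56.30, while the C1-C versus C-NC comparison falls well below 1.0.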
Discussion 

The presentation of an appropriate cate- 
gory name as a storage cue may lead to a 
stable association between the category 
name and a noun. That is, the storage cue 
may provide the category name as a 
storage tag for a noun (e.g., Yntema & 
Trask, 1963). When storage cues have been 
presented, the presentation of retrieval 
cues would reinstate the storage tags so 
that the associations could be used to re- 
trieve the nouns, and recall would be 
higher than when the retrieval cues are 
not presented to reinstate the storage tags. 
However, when storage cues have not been 
presented, storage tags are not provided. 
While Ss likely produce their own storage 
tags, it would seem unlikely that each 
noun is tagged by a category name. There- 
fore, the presence of retrieval cues would 
not lead to greater recall than the absence 
of retrieval cues. In summary, the principle 
that emerges is that retrieval cues may 
facilitate recall when they are successful 
in reinstating the storage tags that were 
formed during input. 


A CROSS COMPARISON OF WRITTEN AND SPOKEN 
RESPONSES WITH VISUAL AND AUDITORY 
CONFIRMATION IN A PAIRED-ASSOCIATE 
SIMULATED READING TASK 


JANICE T. GIBSON AND BARBARA L. HAYDEN 
University of Pittsburgh 


64 college students learned a list of novel-symbol-nonsense-syl- 
lable paired associates in a task that simulated early reading. A 
2 X 2 factorial design was employed in which Ss either wrote or 
spoke their answers, and in which confirmation of results was either 
visual or auditory. All experimental groups required the same num- 
ber of trials to reach criterion in the initial learning task. Later, 
the same groups decoded triphoneme “words” composed of the 
learned symbols. The Ss who wrote their answers made fewer er- 
rors than those who spoke them. There was a significant interaction 
between response and confirmation modes in this task. With writ- 
ten responses, visual confirmation produced fewer errors. With 
spoken responses, auditory confirmation produced fewer errors. 


The process of learning to read has never 
been explained in its entirety. As a result, 
methods of teaching reading often are 
haphazard, and attempts to correct read- 
ing problems often are unsuccessful. In 
order to develop more effective methods of 
teaching reading, it would be helpful to 
first analyze the reading process, break it 
down into its components, and then de- 
termine the variables affecting each com- 
ponent. This approach is not without 
precedent; Gibson (1965) discussed the 
value of analyzing beginning reading as a 
perceptual decoding process and experi- 
mentally manipulating the variables af- 
fecting it. 

The reading process also can be con- 
sidered a verbal learning task. Reading 
experts and other learning theorists al- 
ready have compared the beginning stages 
of reading by phonics to paired-associate 
(PA) learning (Fries, 1963; Levin, Wat- 
son, & Feldman, 1964; Piekarz, 1963). In 
the PA task, S learns to make a specific 
and appropriate response each time a non- 
sense syllable is presented visually. Begin- 
ning readers also learn to make appro- 
priate responses to what initially are 
nonsense graphic stimuli. A child learning 
by phonics often learns to decode a new 
word letter by letter without first having to 
hear the word in its entirety or to know 
what it means. This happens frequently 
when a phonic alphabet such as the Pitman 
Initial Teaching Alphabet is used. 

Verbal learning tasks have been used in 
the laboratory to study some of the vari- 
ables affecting reading. Levin, Baum, and 
Bostwick (1965) and Levin and Watson 
(1965) used lists of PAs in which words 
composed of novel symbols served as stim- 
uli and familiar English words as re- 
sponses to study the effects of different 
letter-sound associations on reading. 
Bishop (1964) simulated the process of 
learning to read by teaching college stu- 
dents to read some Arabic terms composed 
of novel graphic stimuli. These stimuli 
presumably were as novel to college stu- 
dents as the letters of the alphabet are to 
beginning readers. 

The learner’s method of responding and 
the instructor’s method of confirming re- 
sults are two variables operating both in 
the beginning stages of reading and in PA 
learning. In beginning reading, the child 
can respond by writing or speaking his 
answer to the printed letter or word. The 
teacher may provide either visual confir- 
mation (showing him the correct word) or 
auditory confirmation (telling him). These 
two response modes and these confirmation 
modes both have been compared in labora- 
tory studies of PA learning. Conflicting re- 
sults have been reported, and it has been 
stated (Cummings & Goldstein, 1964; McGeoch 
& Irion, 1952; Otto, 1961) that 
it is impossible to establish one method of 
presenting information as unequivocally 
superior to another. It is important to 
note, however, that previous studies of 
response or confirmation modes have in- 
vestigated only one or another of these 
variables. No study has made use of an 
experimental design that would allow for 
simultaneous investigation of both response 
and confirmation modes. As a result, the 
possibility of an interaction between these 
two variables has never been explored. 
Such an interaction conceivably could be 
used to establish the conditions under 
which one method of presenting informa- 
tion is superior to another. 

The purpose of this study was to in- 
vestigate the effects of two variables on the 
learning of a PA task that was similar in 
many respects to beginning reading. These 
two variables were (a) S's method of re- 
sponding, and (b) E's method of confirm- 
ing results. To investigate these variables, 
college students were taught to associate a 
series of novel-symbol-nonsense-syllable 
pairs in the usual PA paradigm. The 
symbols used were unfamiliar but dis- 
criminable to Ss. They were presented 
visually so as to simulate the earlier 
stages of reading described by Gibson 
(1965). Responses were either written or 
spoken; confirmation was visual or audi- 
tory. The Ss’ initial learning rate of the PA 
task and later ability to decode triphoneme 
“words” were measured. 


METHOD 


Subjects 


The Ss for this study were 64 male and female 
undergraduate introductory psychology students 
at the University of Pittsburgh. 


Apparatus¹ 


Each S was run individually in an experimental 
room. When S was ready, E projected the first 
novel symbol on a Sawyer’s Mirascreen 3 feet in 


1 The apparatus used in this research was de- 
signed and constructed by Robert H. Gibson, Uni- 
versity of Pittsburgh. The projector, interval 
timer, and Language Master were used together 
in such a manner that stimuli could be presented 
either by visual or auditory means, or by both 
means simultaneously. Confirmation could be 
given by the same three methods. This equipment 
is useful for studying the effects of different types 
front of him, using a Kodak Carousel 800 pro- 
jector. A Hunter decade interval timer controlled 
display duration. When the symbol had appeared 
for 1 second, the projector advanced to a blank 
slide, and S responded, either by writing or speak- 
ing his answer. He then pushed the stimulus- 
control button, bringing either visual or auditory 
confirmation of results. Visual confirmation con- 
sisted of the novel-symbol-nonsense-syllable pair 
projected on the screen for 1 second by the 
Carousel projector. Auditory confirmation con- 
sisted of simultaneous presentation of the novel 
symbol on the screen and the sound of the non- 
sense syllable played back for 1 second by a Bell 
and Howell Language Master. Visual confirma- 
tion duration was controlled by the Hunter timer; 
auditory display time was controlled by the Lan- 
guage Master. A series of 10 symbol-nonsense- 
syllable pairs were presented in this fashion and 
repeated until a criterion of one successful trial 
was reached. 


Procedure 


The Ss were randomly assigned to learn the 
list of 10 PAs under one of the four following 
conditions: 

1. Written response-visual confirmation. (These 
Ss wrote their answers and then viewed the correct 
response.) 

2. Written response-auditory confirmation. 
(These Ss also wrote their answers, but heard the 
correct response.) 

3. Spoken response-visual confirmation. (The 
Ss spoke their answers, then viewed the correct 
response.) 

4. Spoken response-auditory confirmation. (They 
spoke their answers and then heard the correct 
response.) 

All Ss performed two tasks. Task 1: Ss learned 
a list of 10 novel-symbol-nonsense-syllable pairs 
in the usual PA paradigm. During the first trial, 
each S responded when the symbol appeared by 
pushing the stimulus-control button. This brought 
either visual or auditory presentation of the cor- 
rect nonsense syllable. Thereafter, each S was 
taught to respond with the correct nonsense 
syllable to each of the novel stimuli according to 
the method prescribed by his experimental con- 
dition. Written-response groups wrote their answers 
on paper, using a separate page for each answer. 
Spoken-response groups spoke their answers as 
E recorded. The first session ended when S com- 
pleted one errorless trial. Task 2: Ten minutes after 
the first task was completed, Ss decoded 10 
triphoneme “words,” each composed of three 
symbols learned in the first session. Each S used 
the same response mode as in Session 1. 

Analyses of variance were performed for the 
four groups using response and confirmation modes 


of stimulus presentation and feedback on learn- 
ing. It is described in detail in an unpublished 
study by R. H. Gibson and J. T. Gibson, “A 
Device to Provide Both Auditory and Visual 
Feedback in Verbal Learning Studies.” 
as the independent variables. Dependent variables 
were trials to criterion in Task 1 and number of 
errors in Task 2. 


RESULTS 


An analysis of variance showed that the 
choice of response and confirmation modes 
had no effect on trials to criterion in Task 
1. Differences did not reach the .05 level 
of significance. As an additional test for 
differences between groups in initial learn- 
ing, the number of overlearning trials (i.e., 
the number of correct trials for each syl- 
lable beyond two correct anticipations) 
was compared for the four experimental 
groups. Again, analysis showed no differ- 
ences. There was no apparent difference 
whether Ss spoke or wrote their answers, 
or whether they saw or heard the correct 
answer. 

When the number of errors made on 
Task 2 was studied, however, differences 
were found between the four groups. Anal- 
ysis of variance showed significant differ- 
ences in error rate due to the response 
modes used (F = 15.65, p < .01). Figure 1 
shows that written-response groups made 
fewer errors than did groups using spoken 
responses. Analysis of variance also showed 
a significant interaction between response 
and confirmation modes in this task (F = 
11.06, p < .01). When Ss wrote their an- 
swers, visual confirmation produced fewer 
errors than did auditory confirmation. 
When Ss spoke their answers, the reverse 
was true; auditory confirmation produced 
fewer errors. 


[Figure: errors plotted by response-confirmation mode for the written-visual, written-auditory, spoken-visual, and spoken-auditory groups.] 

FIG. 1. Mean number of errors in Task 2. 




Discussion 


It is interesting to note that while ap- 
parently all groups required the same num- 
ber of trials to criterion in Task 1, they 
responded differently from one another 10 
minutes later in Task 2. Two differences in 
the natures of these tasks may have caused 
this finding. First, learning rate was meas- 
ured in the earlier Task 1, while retention 
was measured later in Task 2. This differ- 
ence in dependent variables alone may 
have been responsible for the results. A 
second explanation lies in the relative 
complexities of the two tasks. Task 1 re- 
quired simple sound-symbol associations. 
Task 2 required decoding or reading of 
triphoneme “words” composed of the 
learned symbols. These “words” approxi- 
mated what Gibson (1965) considered to 
be higher order units of structure. Most 
researchers agree that reading higher order 
units entails more than simple sequential 
letter-sound associations. It is possible 
that success in Task 2 required a form of 
learning that was not measured by the de- 
pendent variables in Task 1. 

The Ss who wrote their answers in Task 
2 made fewer errors than did Ss who spoke 
them. There are two explanations possible 
for the superiority of the written response 
in this situation. McGeoch and Irion 
(1952) suggested that the effect of a par- 
ticular sense modality may depend on S's 
familiarity with it. It appears valid to ex- 
tend this theory to cover familiarity with a 
particular response mode. In other words, 
the superiority of the written response may 
be due simply to the fact that college stu- 
dents have had much more experience dur- 
ing their school careers in writing answers 
than in speaking them. However, another 
and different explanation of the superiority 
of the written response may be made. 
Monroe (1933) wrote that kinesthetic re- 
sponses such as “writing in the air” helped 
remedial readers to discriminate between 
words. This type of response may have 
been useful because it focused attention on 
the accurate spelling of the words and 
made discrimination easier. In the present 
study, the written response may have func- 
tioned in a similar fashion to make dis- 
crimination easier. (Although they re- 
corded no data for this occurrence, Es 
noted that Ss who spoke their answers of- 
ten slurred or omitted the final consonant 
of the nonsense syllables. The Ss who wrote 
their answers, on the other hand, were 
forced to make true letters, and thus per- 
haps made more careful discriminations.) 
Finally, a discussion of the interaction 
between response and confirmation modes 
in Task 2 is in order. The Ss who wrote 
their answers made fewer errors when 
visual confirmation was used. When Ss 
spoke their answers, auditory confirmation 
was more effective. McGeoch and Irion 
(1952) felt that the effect of a particular 
sense modality was due to the S’s famil- 
iarity with that mode. It should follow 
that an S given experience at receiving 
feedback through a particular sense modal- 
ity would be able to respond more effec- 
tively through that same sensory mode. 
This explanation would serve also to ex- 
plain why the college students in the 
written-response-visual-confirmation group 
made fewer errors than any other group. 
As regards the implications for teaching 
reading, the interaction found between re- 
sponse and confirmation modes suggests 
that the teacher should decide what he 
wants the student to do before he selects a 
confirmation mode. Visual means of pro- 
viding correct answers probably are most 
effective in teaching reading when achieve- 
ment is based on writing proficiency. Au- 
ditory confirmation probably is more ef- 
fective when speaking proficiency is desired. 
Several major questions still need to be 
answered. First, were the results obtained 
in this study due to differential familiarity 
with the particular systems? In this experi- 
ment, Ss were of college age and were un- 
doubtedly more familiar with the visual- 
confirmation-written-response system than 
small children first learning to read. Sec- 
ond, most reading experts agree that read- 
ing is more than the association of sounds 
with words; it involves, in addition, the as- 
sociation of meaning with these sounds and 
words. A question still remains, therefore, 
of whether the same results would be ob- 
tained with meaningful words rather than 
the meaningless terms used in this study. 
A comparison of the response-confirmation 
modes in the decoding of meaningful terms 
by adults and young children would answer 
both these questions. This research cur- 
rently is being planned. 


PRODUCTION AND JUDGMENT OF SOLUTIONS TO 
FIVE PROBLEMS¹ 


DONALD M. JOHNSON, GEORGE L. PARROTT, AND 
R. PAUL STRATTON 
Michigan State University 


Different groups of college Ss wrote one solution or many solutions to 
problems, like the plot-title problem, of verbal, numerical, and 
pictorial material. Instructions to write many solutions yielded solu- 
tions of lower mean quality but more superior solutions. Informa- 
tion about criteria for good solutions raised quality. Large quantity 
was associated with low quality, both for variations in conditions 
and for individual differences within conditions. Differences between 
problems in the quality-quantity relation were dependent on the 
number of superior solutions to the problem. 3 types of judgment 
training—individual, dyadic, and tutorial—interpolated between pro- 
duction of solutions and selection of the best solution were generally 
successful and, under certain favorable conditions, improved overall 
performance. 


Problem solving, like learning, is not a 
simple homogeneous activity. This truism 
is important for the analysis of problem 
solving because a statement that holds for 
one component process may not hold for 
another. It is important also for attempts 
to improve problem solving because a pro- 
cedure that facilitates one process may not 
facilitate another. Therefore, the research
to be described separates problem solving
into three different but functionally inter-
dependent processes: preparation, produc-
tion, and judgment.

Intellectual tasks begin with some kind 
of preparation, most often the acquisition 
and organization of information, as by 
listening to instructions or by reading a 
printed paragraph. After preparation, 
some tasks are primarily productive, as in 
writing many uses of a brick; and some 
are mostly matters of judgment, as in
selecting the best answer on a multiple-
choice test. The present research is con-
cerned with that large class of problems,
not often investigated, for which both pro-
duction and judgment are required. The
S produces several possible solutions, then
examines these and picks one as his best
effort. In such cases production is different
from preparation but depends on it. Judg-
ment is different from production, but it
is the solutions produced that are judged.
The different processes are interdependent,
but they can be studied separately, and
conditions that influence each can be ex-
perimentally manipulated.

¹ This investigation was supported by the U. S.
Office of Education, Project No. 5-0705. A final re-
port containing several supplementary studies will
be available in mimeographed form.

Early attempts at analysis of thinking 
into component processes (Dewey, 1910; 
Wallas, 1926) suffered from the inade- 
quacies of a subjective method. More ob- 
jective analyses of protocols obtained from 
poets and artists by the thinking-aloud 
method have been published (Patrick, 1935, 
1937). Similar procedures have been used 
to study the problem-solving processes of 
students at work (Bloom & Broder, 1950; 
Burack, 1950) and to collect data for com- 
puter simulation of thinking (Newell, Shaw, 
& Simon, 1958). Another procedure (John- 
son, 1960, 1961) maintains more control 
over S's activities by serial exposure of the
materials of the problem. This has per- 
mitted identification and timing of two or 
three operations, and study of the contribu- 
tion of each to the whole problem-solving 
enterprise with only slight disturbance of 
overall performance. Factor analysis of 
individual differences in success on minor 
problems often leads to factors that are 
labeled as problem-solving processes (Guil- 
ford, 1967), but the technique can also be 
applied to time spent on each process 
(Johnson & Jennings, 1963). Records of 
time spent on preparation, production, and 
judgment by college students doing three 
plot-title problems yielded low correlations 
between different processes and high corre- 
lations between like processes. Aside from 
the specific findings these studies have 
demonstrated the feasibility of analysis of 
the solution of problems into a few proc- 
esses by manipulation of instructions and 
exposures and the study of their interrela-
tions.

Brainstorming, although aimed at prac- 
tical rather than analytical goals, depends 
on an implicit analytical assumption. The 
hypothesis that deferment of judgment 
improves production of ideas rests on the 
assumption that judgment and production 
are different processes that can occur at 
different times. (The other hypothesis of 
brainstorming, that social interaction fa- 
cilitates production, is not relevant to the 
present research.) Instructions and train- 
ing are directed toward freeing S from 
premature self-criticism. Meadow and 
Parnes (1959) found that a 30-hour course 
in creative problem solving did increase 
quantity and quality of output on certain 
tasks such as Guilford’s test of producing 
unusual uses for common objects. Meadow, 
Parnes, and Reese (1959) compared brain-
storming instructions, emphasizing quan-
tity of production and ignoring quality,
with nonbrainstorming instructions, em-
phasizing quality and penalizing for ideas
rated poor. More solutions of high quality
came from the brainstorming group. Parnes
and Meadow (1959) also found a difference
in favor of brainstorming instructions and, 
in addition, a high correlation between
quantity and quality. That is, those
who produced more solutions produced more 
superior solutions. Later, Parnes and Mea- 
dow (1960) tested some of their students 
8 months after a course in creative prob- 
lem solving and found improvement on all 
tests, including plot titles rated for high 
quality, as compared with control groups. 

A rather detailed study of instructions 
by Gerlach, Schutz, Baker, and Mazer 
(1964) raised questions that must be con- 
sidered in further research. They wrote six 
sets of instructions, including brainstorm-
ing, nonbrainstorming, penalty for bad re-
sponses, and what they called “criteria-
cued” instructions. The instructions for
this group read: “The more imaginative or
creative your ideas, the higher your score
will be. Each idea will be scored in terms of 
how unique it is, how valuable it is... the 
more original and creative the better.” On 
the familiar test of writing uses for a coat 
hanger the criteria-cued instructions did 
yield the most good responses, hence the 
authors argue that the improvement at- 
tributed to brainstorming could be due to 
learning a criterion of quality. This inter- 
pretation, which is quite different from the 
original conception of brainstorming, sug- 
gests that the improvement may be in the 
judgment process as well as the production 
process. 

Records of production over time should 
help to explain total output. It is conceiv- 
able that S produces his best solution on 
the first try and, if pushed to produce more, 
produces solutions of lower quality. It is 
also conceivable that the good solutions
appear later. To date, the results have been 
conflicting. Christensen, Guilford, and Wil- 
son (1957) found that the production of 
simple responses decreases during a working 
period of about 15 minutes while the pro- 
duction of plot titles was linear and the 
quality of these titles was constant. John- 
son and Jennings (1963) had their Ss write 
five plot titles and found that the best one 
occurred in each position equally often. 
Parnes (1961) found an increase in the 
number of unusual uses and takes this 
result as an argument for extended effort 
in brainstorming. Apparently the course of
production is different for simple responses
and for complex solutions, but further re- 
search is needed. 

This brief review of relevant experiments 
points up the need for a fundamental exam- 
ination of the interrelated contributions of 
production and judgment to problem solv- 
ing. The results obtained to date suggest 
certain methods to be tried and certain
errors to be avoided. An adequate descrip- 
tion of the processes leading to the final 
solution requires detailed analysis of a 
large number of solutions and some addi- 
tional variations in conditions. Curiously, 
although the outputs of Ss producing many 
solutions under different instructions have 
been compared, the standard condition has 
been overlooked. Instructing S to write one 
solution to a problem may be considered 
the standard condition to which more com- 
plicated conditions should be compared. 
The question of how well Ss can judge their 
own solutions must be examined because 
assumptions about self-criticism are com- 
monly included in speculations about 
thinking. A previous study (Johnson & Jen- 
nings, 1963) with limited data indicated 
that the accuracy of college students in 
judging their own solutions to the plot- 
title problem was above chance levels 
but not very high. The possibility of in- 
creasing accuracy in judgment is worthy of 
serious investigation since improvement has 
been obtained under some conditions (John- 
son & Zerbolio, 1964). 

The measures of interest are the number 
of solutions produced by each S, the aver- 
age quality of these solutions, and the num- 
ber of superior solutions. Differences be- 
tween Ss on these measures may also be 
enlightening. Reliable ratings of the solu- 
tions are necessary; other methodological 
considerations, suggested by reports of 
previous research, will be mentioned below.

The first experiments analyzed the con-
tributions of production and judgment to
problem solving. Since the results empha-
sized the role of judgment, brief programs
for training judgment were prepared, and
the later experiments evaluated the im-
provement obtained. The same problems
and some of the same procedures were em-
ployed in all experiments.


STANDARDIZATION OF PROBLEMS 


The present research is focused on 
problems with many solutions that cannot 
be dichotomized as right or wrong but can 
be graded in respect to such qualities as 
usefulness, appropriateness, cleverness, and 
originality. This might be called productive 
thinking as well as problem solving. In 
Guilford’s (1967) terminology it is diver- 
gent thinking as opposed to convergent 
thinking. Another consideration was that 
the problems should be substantial prob- 
lems that would offer some challenge to 
college students. Tasks as simple as writ- 
ing uses for a brick or giving uncommon 
associations to words have been criticized 
as trivial. Any punster knows that the set 
for simple verbal productions is highly 
vulnerable to variations in instructions 
and social atmosphere. 

It was necessary to choose problems 
yielding solutions that vary considerably 
in quality and that can be reliably rated 
as to quality. Finally, within the restric- 
tions of the design, the problems should 
vary in content. The literature on problem 
solving contains many findings that apply, 
as far as is known, to only one problem. 
The use of several problems permits gen- 
eral principles to emerge as well as dif- 
ferences between problems. 

The plot-title problem, which has been 
used in some of the research mentioned 
above, meets these specifications; and it 
is also a good test of the originality fac- 
tor, according to Wilson, Guilford, and 
Christensen (1953). Reading the instruc- 
tions for the problem and then the plot 
itself may be considered the preparation. 
Production consists of writing titles for it. 
Judgment consists of selecting the best of 
these as the final solution. 

Twelve problems were constructed along 
the lines of the plot-title problem, two 
examples each of six types, and tried out 
with several samples of college students. 
Four were chosen on the basis of the num- 
ber of solutions written in 5 minutes, the 
correlation between ratings of the solu-
tions by two raters, and the standard de-
viation of the ratings. Two other types,
Consequences and Plot Completions, did 
not yield satisfactory interrater agreement. 
The problems chosen, as well as the plot- 
title problem, are briefly described below, 
together with instructions for multiple 
solutions. 


Plot Titles. A paragraph gave the plot of a 
story or movie, with the following instructions: 

“Your task is to think of titles for the story. 
Read the plot then write as many titles for it as 
you can.” 

Table Titles. A table of agricultural data, show-
ing four columns of statistics for seven time
periods, was printed, with the following instruc- 
tions: 

“Table X below has reference to United States 
statistics, and was taken from a past volume of 
the World Almanac. You are to examine the table 
and then write as many titles for it as you can.” 

Conclusions. A chart was printed, with column 
diagrams representing social-welfare expenditures 
under five public programs for six time periods, 
with the following instructions: 

“Figure XVI is taken from The Statistical 
Abstracts of the United States, 1964. What can you 
conclude from this table? Write short sentences, 
as many as you can, each of which summarizes a
generalization from this table.” 

Sentences. Wallach and Kogan (1965) asked 
children to write short stories using four words. 
For present purposes with adult Ss the integra-
tion of four words in a single sentence seemed
preferable:

“Write many sentences, each of which con- 
tains these four words.” 

happy    expensive    horse    lake

Cartoon. A cartoon was presented in four 
squares, with the printing removed from the last 
square. The instructions were to “write as many
different quotes for the last square as you can.” 


Although the solutions were all written
in words, the materials presented included
verbal, numerical, and pictorial materials
in order to achieve some variety of con-
tent.


EXPERIMENT I: PRODUCTION
AND JUDGMENT


Method 


Instructions 


When Ss are instructed to ignore quality or not
to be critical, they write more solutions, but the
qualitative and quantitative effects of the in-
structions are confounded. Hence Group 1 was
instructed to write one solution to each problem,
and Group 2 was instructed to write as many as
possible. There was no reference to quality in
either case.

Group 3 was instructed to write as many solu- 
tions as possible, then to select the best. This 
variation was planned as a check on Ss’ ability to 
judge their own solutions and as a comparison 
between preferred solutions and the single solu- 
tions produced by Group 1. 

Groups 4 and 5 were included to examine the 
effects of criteria-cued instructions reported by 
Gerlach et al. (1964). The instructions asked for 
“good” or “clever” solutions, and these were 
followed by more specific criteria based on the 
criteria developed by the raters: 


Plot Titles. By clever we mean an imagina- 
tive, creative, or unusual title for this plot. 

Table Titles. A good title is a comprehensive 
one that includes the important points con- 
cisely. 

Conclusions. A good conclusion would be a 
valid generalization which integrates the table 
as a whole.

Sentences. A good sentence reads smoothly; 
the four words fit unobtrusively into the struc-
ture of the sentence.

Cartoon. A clever quote is an imaginative 
idea that fits the cartoon. 


Group 4 received the criteria-cued instructions, 
along with a request to write as many as possible.
Group 5 received the same instructions, and, like 
Group 3, was requested to select the best solu- 
tion later. 


Procedure 


Since each S was to do five problems, the 
possibility of order effects arose. In Groups 3 and 
5, especially, after S did the first problem and dis- 
covered that postproduction judgment was re- 
quired, performance on later problems could be 
influenced. Therefore five orders were arranged, 
with each problem appearing once in each posi-
tion.
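
The counterbalancing described above can be sketched in a few lines. The report does not say how the five orders were constructed; the cyclic rotation used here is only one arrangement that satisfies the stated requirement (each problem once in each position), and the function name is hypothetical.

```python
def cyclic_orders(problems):
    """Return len(problems) orders; each problem occupies each position once.

    Assumption: a simple cyclic rotation (a Latin square). The original
    booklets may have used a different arrangement.
    """
    n = len(problems)
    return [[problems[(start + pos) % n] for pos in range(n)]
            for start in range(n)]


problems = ["Plot Titles", "Table Titles", "Conclusions", "Sentences", "Cartoon"]
orders = cyclic_orders(problems)

for number, order in enumerate(orders, start=1):
    print(number, order)

# Five instruction groups crossed with five orders gives the 25 booklet types.
booklet_types = [(group, number)
                 for group in range(1, 6)
                 for number in range(1, len(orders) + 1)]
print(len(booklet_types))
```

With eight booklets of each of the 25 types, this layout accounts for the 200 Ss described under Subjects.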

Thus 25 types of booklets were prepared: sepa-
rate forms for the five groups, and five orders for
each form. Each problem was printed on a page
of 8½ × 11 inch paper; the different orders were
arranged when the pages were stapled in booklets.
Following each of the five problem pages, Groups
3 and 5 had additional pages with the instruc-
tions: “Now turn back to the titles (or sen-
tences, etc.) you wrote and pick the best one. Put
a check mark (✓) beside it.” Other groups had
filler pages of irrelevant material expected to re-
quire the same working time.

Since some of the standardization Ss complained
that 5 minutes did not suffice, 7 minutes were
allowed for each problem.




Subjects 


The Ss were 200 students in general psychology 
at Michigan State University, mostly freshmen 
and sophomores, divided into five groups of 40 
each, within which there were five orders. Eight
each of the 25 types of booklets were distributed
serially during a regular class meeting. Eight
booklets were returned incomplete, so eight Ss, 
drawn from the same population, were run later 
to fill the missing cells. 

Scores on the College Qualification Test (CQT) 
and a locally constructed reading test were avail- 
able for most students. Table 1 shows that the 
five groups were quite similar in respect to these 
scores. (Since the scores for two Ss could not be 
found, the means of Groups 1 and 3 are based on 
only 39 scores). 


Results 


The results consist of 5,215 solutions, 
about 1,000 for each problem. Each solution 
was typed on one side of a card, and code 
numbers were typed on the other side. The 
cards for each problem were shuffled and
given to two judges for blind rating. 


TABLE 1
MEANS AND STANDARD DEVIATIONS OF FIVE
GROUPS ON TWO TESTS


Agreement Between Judges 


The judges had rated the solutions ob-
tained earlier and had cooperatively written
general criteria and specific points on a
scale of 1 (low quality) to 7 (high quality)
for each problem. They used solutions
from incomplete papers for additional
practice, discussion, and refinement of
ratings. After this practice, amounting to
about 30 solutions per problem, they rated
all solutions independently, one problem
at a time. The two ratings, together with
code numbers, were then punched on cards
for electronic data processing.

Since the reliability of the ratings is
crucial for this type of research, the solu-
tions were rated in blocks of about 200
and interjudge correlations were computed
for each block. Table 2 shows that Conclu-
sions and Sentences were the easiest to rate,
but in general the agreement was adequate
and fluctuations over time were small. (The
last correlation in each line of Table 2 is
based on less than 200 solutions.) Subse-
quent computations use the sums of these
ratings, which range from 2 to 14.


TABLE 2
CORRELATION COEFFICIENTS COMPUTED BY
BLOCKS OF 200 SOLUTIONS TO SHOW INTERJUDGE
AGREEMENT THROUGHOUT RATING PERIOD

                                  Block
Problem      |  1   |  2   |  3   |  4   |  5          |  6   |  7
Plot Titles  | .548 | .688 | .596 | .746 | .764        | .747 |
Table Titles | .988 | .988 | .986 | .984 | .974        |      |
Conclusions  | .748 | .908 | .922 | .924 | [illegible] |      |
Sentences    | .866 | .808 | .822 | .808 | .787        |      |
Cartoon      | .658 | .748 | .748 | .795 | .740        | .787 |
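
A minimal sketch of the blockwise reliability check follows. The ratings are simulated, not the study's data; the noise model for the second judge, the number of ratings, and all function names are assumptions made only for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def blockwise_r(judge_a, judge_b, block_size=200):
    """Interjudge correlation for successive blocks of solutions."""
    return [pearson_r(judge_a[i:i + block_size], judge_b[i:i + block_size])
            for i in range(0, len(judge_a), block_size)]


# Hypothetical ratings on the 1-7 scale; the second judge usually agrees
# and otherwise differs by one point (an assumed noise model).
random.seed(0)
judge_a = [random.randint(1, 7) for _ in range(1341)]
judge_b = [min(7, max(1, a + random.choice([-1, 0, 0, 0, 1]))) for a in judge_a]

block_rs = blockwise_r(judge_a, judge_b)
summed = [a + b for a, b in zip(judge_a, judge_b)]  # scale of 2-14

print([round(r, 3) for r in block_rs])
```

As in Table 2, the last block is shorter than the others, so its coefficient rests on fewer than 200 solutions.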


Order Effects 


The data for each problem were inspected 
for order effects, but no such effect ap- 
peared, either for number of solutions or 
mean rating of the solutions. Hence the 
data obtained from the five positions have 
been combined in later analyses.


Number of Solutions Produced 


Each S of Group 1 wrote one solution as
instructed, but the frequencies in the four
multiple-solution groups, shown in Table
3, were influenced by the more specific in-
structions. In contrast to Groups 2 and 3,
Groups 4 and 5 received the criteria for
good solutions. In contrast to Groups 2
and 4, Groups 3 and 5 had to identify their
best solutions. The effects of these varia-
tions on each problem were tested by a
2 × 2 analysis of variance, with 40 Ss in
each cell. The criteria-cued instructions sig-
nificantly reduced productivity on Plot
Titles (p < .01), Conclusions (p < .01),
Sentences (p < .01), and Cartoon (p <
.025). The reduction was not significant
on Table Titles, but it was significant when
each S’s total production on all five prob-
lems was treated as a single score (p < .01).
Instructions to select the best solution did
not influence productivity.


TABLE 3
MEAN NUMBER OF SOLUTIONS PER SUBJECT TO
FIVE PROBLEMS PRODUCED BY FIVE GROUPS


TABLE 4
FREQUENCIES OF RATINGS, ON SCALE OF 2-14, OF SOLUTIONS TO FIVE PROBLEMS

                                           Rating
Problem      |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |  10 |  11 |  12 | 13 | 14
Plot Titles  | 140 | 125 | 249 | 217 | 232 | 146 | 120 |  52 |  30 |  18 |   4 |  4 |  4
Table Titles | 145 |  15 |  93 |  10 |  73 |  15 | 209 |  13 | 117 |   5 | 124 |  1 | 62
Conclusions  | 123 |  25 |  59 |  36 | 247 |  54 |  75 |  55 | 123 |  46 |  68 | 24 | 23
Sentences    |  39 |   0 |  34 |  38 | 256 | 151 | 205 |  81 |  83 |  26 |   9 |  5 |  4
Cartoon      |  55 |  31 |  93 |  78 | 181 | 166 | 239 | 137 |  79 |  29 |  20 |  8 |  1
Total        | 502 | 196 | 528 | 379 | 989 | 532 | 848 | 338 | 432 | 124 | 225 | 42 | 94


Distribution of Ratings 


Table 4 shows the distributions of the 
ratings of the solutions to the different 
problems for all five groups combined. 
Some positive skew is apparent. The judges 
attempted to spread out their ratings but, 
as one might expect, the ratings piled up 
at the low end. 

The irregularity that appears in the dis-
tributions should be noted. When interjudge
agreement is high, the sum of two ratings is
usually an even number. Odd sums occur
only when raters disagree. The greater fre-
quency of even sums over odd sums ap-
pears most sharply in solutions to Table
Titles and Conclusions because the relia-
bility of the ratings was very high in each
case. This effect has no bearing on the pres-
ent research.
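
The parity argument can be checked directly. The sketch below enumerates every pair of ratings two judges could give on the 1-7 scale and then counts even and odd sums among the Table Titles frequencies reported in Table 4; the variable names are illustrative only.

```python
from itertools import product

# Every possible pair of ratings from two judges on the 1-7 scale.
pairs = list(product(range(1, 8), repeat=2))

# Exact agreement always gives an even sum (twice the shared rating).
agreeing_sums = sorted({a + b for a, b in pairs if a == b})

# An odd sum is possible only when the two ratings differ.
odd_sum_pairs = [(a, b) for a, b in pairs if (a + b) % 2 == 1]
all_odd_sums_are_disagreements = all(a != b for a, b in odd_sum_pairs)

# Table Titles frequencies by summed rating, taken from Table 4.
table_titles = {2: 145, 3: 15, 4: 93, 5: 10, 6: 73, 7: 15, 8: 209,
                9: 13, 10: 117, 11: 5, 12: 124, 13: 1, 14: 62}
even_count = sum(f for rating, f in table_titles.items() if rating % 2 == 0)
odd_count = sum(f for rating, f in table_titles.items() if rating % 2 == 1)

print(agreeing_sums)
print(all_odd_sums_are_disagreements)
print(even_count, odd_count)
```

Note that an even sum does not guarantee agreement (ratings of 3 and 5 also sum to 8), which is why the text says even sums are only "usually" the result when agreement is high.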
The consequences of the production of
many solutions can be best understood by
examination of complete distributions of
ratings. The clearest comparison is be-
tween Groups 1 and 2, which differed only
in number of solutions produced, hence
Figure 1 has been prepared to facilitate
this comparison for each of the five prob-
lems. The irregularity mentioned above has
been removed by combining even and odd
sums, with the exception that the highest
ratings, 12, 13, and 14, have been combined
in one interval. In general, when Ss were
instructed to write many solutions, they
wrote more superior solutions, more medi-
ocre solutions, and more inferior solutions.
Since they wrote more solutions of all de-
grees of quality, they wrote more solutions
of high quality.


Intraindividual Variability 


A fundamental characteristic of produc-
tive thinking under instructions to produce
many solutions is the variation in quality
of the solutions produced by each S. It is
conceivable that some Ss might write mostly
inferior solutions, others mostly mediocre
solutions, and others mostly superior solu-
tions, but the raw data do not show such
results. For example, the printout of the
ratings of each of the eight Ss of Group 2
who had the plot-title problem in the first
position shows that the individual ratings
extended, respectively, from 2 to 8, 2 to 4,
2 to 11, 2 to 11, 2 to 13, 4 to 7, 5 to 9, and
6 to 7. A tabulation of the ranges for the
160 Ss of Groups 2-5 on each of the five
problems, taken one at a time, is displayed
in Table 5. Since the complete range of
the ratings was 13 (2-14), Table 5 demon-
strates that the variability in quality of
the solutions produced by single Ss was
large. Almost all Ss who wrote more than
one solution wrote solutions of a wide range
of quality. The median range per S per
problem was about 7. (Only 786 sets of
solutions are represented in Table 5 be-
cause 14 Ss wrote only one solution.)


FIG. 1. Distributions of ratings of solutions to five problems by 40 Ss who wrote one
solution (Group 1) and 40 Ss who wrote many solutions (Group 2). (The base line
represents the sum of two independent ratings on a scale of 1-7. Odd and even sums have
been combined to remove irregularities, except that the highest ratings, 12, 13, and 14, have
been placed in one interval.)


TABLE 5
INTRAINDIVIDUAL RANGES IN RATINGS
OF SOLUTIONS


Mean Quality of Solutions


In order to compare the multiple-solution
groups with the single-solution group, each
S in the multiple-solution groups was as-
signed a score representing the mean of the
ratings of all his solutions to a problem. It
is the mean of these 40 means for the 40
Ss that is entered in each cell in Table 6.
The standard deviations of these distribu-
tions of mean scores are generally smaller
than the standard deviations of the dis-
tributions of single ratings for Group 1
because the averaging reduces the varia-
bility.


TABLE 6
MEANS AND STANDARD DEVIATIONS OF RATINGS OF SOLUTIONS TO FIVE PROBLEMS
PRODUCED BY FIVE GROUPS

      | Plot Titles | Table Titles | Conclusions |  Sentences  |   Cartoon   |       Total
Group |  M   |  SD  |  M   |  SD   |  M   |  SD  |  M   |  SD  |  M   |  SD  |  M   |  SD
1     | 5.78 | 2.66 | 9.22 | 3.39  | 9.12 | 4.62 | 7.52 | 2.26 | 7.55 | 2.22 | 7.84 | 1.80
2     | 5.49 | 1.24 | 6.59 | 3.03  | 6.83 | 1.73 | 6.99 | 1.42 | 6.86 | 1.01 | 6.56 | .95
3     | 5.27 | 1.04 | 8.18 | 2.61  | 6.74 | 1.78 | 7.19 | 1.26 | 7.27 | 1.25 | 6.90 | [illegible]
4     | 5.72 | 1.14 | 8.22 | 3.01  | 7.91 | 1.94 | 7.55 | 1.55 | 7.16 | 1.24 | 7.31 | 1.10
5     | 5.53 | .99  | 8.23 | 2.78  | 7.65 | 2.40 | 7.99 | 1.06 | 7.13 | 1.52 | 7.31 | .89

The important difference, due simply to
instructions to write many solutions, is
between Groups 1 and 2. For all five prob-
lems mean quality is higher for Group 1.
When tested by simple t tests, taking ac-
count of the heterogeneity of variance, the
difference is significant for Table Titles
(p < .01) and Conclusions (p < .01). When
these ratings are combined as a total qual-
ity score for each S on all five problems,
Group 1 has a mean of 7.84 and an SD of
1.80, while Group 2 has a mean of 6.56 and
an SD of .95 (t = 3.98; p < .01). It is gen-
erally true in this experiment that instruc-
tions to write many solutions reduced the
mean quality of the solutions. To evaluate
the magnitude of this difference the two
distributions of solutions to all five prob-
lems were combined as one distribution of
1,578 solutions. In respect to this combined
distribution (with positive skew), the Group
2 mean was at the fifty-ninth percentile
while the Group 1 mean was at the seventy-
seventh percentile.
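
The reported t for the total quality score can be approximately reproduced from the summary statistics alone. The sketch below assumes an unequal-variance (Welch-type) t statistic with 40 Ss per group; the text says only that the tests took account of heterogeneity of variance, so the exact procedure used is an assumption, as is the function name.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and approximate df without assuming equal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Total quality scores from Table 6: Group 1 versus Group 2, 40 Ss each.
t, df = welch_t(7.84, 1.80, 40, 6.56, 0.95, 40)
print(round(t, 2), round(df, 1))
```

Under these assumptions the statistic rounds to the value given in the text, t = 3.98.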

The effects of the more specific instruc-
tions given to Groups 2-5 were evaluated
by a 2 × 2 analysis of variance for each
problem with the 40 means for Ss in each
cell. The request to select the best solution
had no significant effect on any problem.
The criteria-cued instructions, however,
increased mean quality on all five problems.
This effect was significant at the .01 level
for Conclusions and Sentences and, of
course, for total solutions. Thus informa-
tion about the criteria for good solutions
not only reduced the number of solutions
produced but also increased the average
quality of these solutions. Average quality
still did not reach the level of Group 1,
however, except for one problem, Sentences.

Number of Superior Solutions 


For some purposes the number of superior
solutions is the important measure, but the
choice of a cutting point to define a supe-
rior solution is somewhat arbitrary. The
ninetieth percentile is a reasonable choice
because it limits the sample to solutions of
relatively high quality yet provides an ade-
quate number for statistical analysis, ap-
proximately 100 for each problem. There-
fore the lower limit for superior solutions
was set at the integer closest to the ninetieth
percentile of the total distribution of rat-
ings for each problem. For example, the
ninetieth percentile of the distribution of
ratings of plot titles (see Table 4) fell be-
tween 8 and 9. Hence solutions rated 9 or
above were considered superior. There were
112 superior solutions by this definition,
slightly less than the ideal 10%.
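
The plot-title example can be retraced from the frequencies in Table 4. The sketch below uses a simple cumulative proportion to locate the ninetieth percentile and then counts the solutions at or above the cutting point of 9; the function names are hypothetical, and the authors' own percentile computation may have differed in detail.

```python
# Plot Titles frequencies by summed rating, taken from Table 4.
plot_titles = {2: 140, 3: 125, 4: 249, 5: 217, 6: 232, 7: 146, 8: 120,
               9: 52, 10: 30, 11: 18, 12: 4, 13: 4, 14: 4}


def cumulative_share(freq, rating):
    """Proportion of solutions rated at or below the given rating."""
    total = sum(freq.values())
    return sum(f for r, f in freq.items() if r <= rating) / total


def superior_count(freq, cutoff):
    """Number of solutions rated at or above the cutting point."""
    return sum(f for r, f in freq.items() if r >= cutoff)


total = sum(plot_titles.values())
share_at_7 = cumulative_share(plot_titles, 7)
share_at_8 = cumulative_share(plot_titles, 8)
n_superior = superior_count(plot_titles, 9)

print(total, round(share_at_7, 3), round(share_at_8, 3))
print(n_superior, round(100 * n_superior / total, 1))
```

The cumulative proportion passes .90 within the rating of 8, so a cutting point of 9 leaves somewhat fewer than 10% of the plot titles in the superior class, as stated in the text.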

Table 7 shows that seven of these superior
solutions to the plot-title problem were
produced by the 40 Ss of Group 1, writing
single solutions, and 29 by the 40 Ss of
Group 2, writing many solutions. In general,
the multiple-solution groups produced more
superior solutions than the single-solution
group, though Conclusions is a possible ex-
ception.

The percentages in Table 7 were calcu-
lated with the number of solutions for each
group, shown in Table 3, as the base. That
is, the 29 superior solutions to the plot-title
problem produced by Group 2 were 8% of
the 377 solutions produced by that group.
This measure, like the mean quality meas-
ure, favors groups that wrote single solu-
tions, while the absolute number of superior
solutions favors groups that wrote many.

The effects of instructions to write many 
solutions on the number of superior solu- 
tions appear most clearly in the comparison 
of Groups 1 and 2. Simple t tests show that
the difference is significant for Plot Titles 
(p < .01), for Sentences (p < .01), for 
Cartoon (p < .05), and for total (p < .01). 

Groups 1 and 2 cannot be compared in 
respect to mean number of superior solu- 
tions per S on each problem because most 
Ss produced none. The groups can be com- 
pared, however, in respect to the number 
of Ss who produced at least one superior 
solution. The difference in favor of Group 2 
is significant for Plot Titles (p < .01) and 
for Sentences (p < .05). For all five prob-
lems the totals are 78 for Group 2 and 47 for Group 1. Only 22 of
the superior solutions in Group 2 came 
from Ss who contributed more than one. 
Evidently many of the superior solutions 
were produced by Ss who would not have 
produced a superior solution had they been 
instructed to produce only one. 

It is possible to compare Groups 1 and 2 
in respect to total number of superior solu- 
tions per S on all five problems because 
these distributions are not seriously 
skewed. The mean number per S for 
Group 1 was 1.17, with an SD of .86. The 
mean for Group 2 was 2.50, with an SD of 
1.69, and the difference between the two 
means is significant (p < .01). All these 
measures indicate that Ss produce more 
superior solutions when they are told to 
write many. 

Although the superior solutions are the
ones that attract the most attention, it is
theoretically interesting to note that the
same quantity increase that increases the
number of superior solutions also increases
the number of inferior solutions. If a rat-
ing of 2 is taken as indicating an inferior
solution, Group 2 produced 171 inferior
solutions to all problems, while Group 1
produced only 23.

The effects of the more specific instruc- 
tions are shown by comparing the number 
of superior solutions produced by Groups 
2-5, as shown in the last column of Table 7. 
Instructions to select the best do increase
the number of superior solutions slightly. 
Of the 506 superior solutions produced by 
these four groups 275 came from Groups 
3 and 5, with instructions to judge their 
solutions. This proportion, .55, is signifi- 
cantly larger than the proportion implied 
by the null hypothesis (p < .05). The effect 
of the criteria-cued instructions is of the 
same magnitude. Of the 506 superior solu- 
tions 280 came from Groups 4 and 5, with 
knowledge of the criteria for good solu- 
tions. Thus the best condition was that of 
Group 5 who were given the criteria for 
good solutions and instructions to select 
their best. This group contributed 149 of
the 506 superior solutions, which is 29% 
rather than 25%. The finding by Gerlach 
et al. (1964) of the value of criteria cues 
is thus confirmed both for mean quality 
and for number of superior solutions. 

Supplementary analyses were carried out 
using a 95% level as defining a superior 
solution. The numbers were smaller and 
the significance levels were lower, but the 


TABLE 7
Frequencies (f) and Percentages of Superior Solutions by Groups and Problems

        Plot Titles  Table Titles  Conclusions   Sentences    Cartoon      Total
Group     f    %       f    %        f    %       f    %      f    %      f    %
1        12    ?       ?    ?       20   50       7   18      8   20     47   23
2         ?    ?      14    4       17    6      24   10     21    8    100    7
3        28    8      12    5       19    7      25   10     42   14    126    9
4        18    7      20   10       32   16      34    ?     28   11    132    ?
5        30   10      17    ?       27   15      37    ?     38   16    149    ?
Total   112    8      63    7      115   12     127    ?    137   12    554    ?
same picture of the results emerged. None
of the above statements were contradicted. 


Accuracy of Judgment 


Those who write many solutions write 
more superior solutions, but they write 
more inferior solutions as well. The critical 
question is one of judgment. Will S be able 
to select his best? If he writes superior 
solutions and selects his worst or even his 
average solution, nothing is gained. Table
8 displays the pertinent comparisons for
Group 3, with instructions to write many 
and then choose the best, and for Group 5, 
with the same instructions plus information 
about the criteria for good solutions. In 
each comparison of this table there are 
39 or 40 preferred solutions for each 
problem—a few Ss did not identify their 
best—while the number of nonpreferred 
solutions ranges from 141 to 310. Each S 
was given a score representing the difference 
between the rating of his preferred solution 
and the mean rating of his nonpreferred 
solutions, and the t tests indicate whether
the means of these difference scores are
significantly greater than zero. It is clear
from Table 8 that the solutions selected as
best were better, by and large, than the
others, although in some cases the differences
were quite small.
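
The difference score described above can be sketched as follows. The ratings are hypothetical, invented only to show the computation, and the helper names are assumptions; the t statistic is the ordinary one-sample t against zero, which is what the text appears to describe.

```python
import math
from statistics import mean, stdev

def difference_score(preferred_rating, nonpreferred_ratings):
    """Rating of the preferred solution minus the mean rating of the other solutions."""
    return preferred_rating - mean(nonpreferred_ratings)

def one_sample_t(scores):
    """t statistic testing whether the mean of `scores` differs from zero."""
    n = len(scores)
    return mean(scores) / (stdev(scores) / math.sqrt(n))

# Hypothetical data: (rating of preferred solution, ratings of nonpreferred solutions).
subjects = [
    (9, [7, 6, 8]),
    (6, [6, 5, 7, 4]),
    (8, [8, 9]),
    (7, [5, 6, 6, 3, 5]),
]

scores = [difference_score(pref, rest) for pref, rest in subjects]
t = one_sample_t(scores)
print(scores, round(t, 2))
```

A positive mean difference score indicates that, on the average, each S's chosen solution was rated above the rest of his own output.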

Comparison of the number of superior
solutions among the preferred and the non-
preferred solutions, as shown in Table 9,
is complicated by the quantity effect; there
are more superior solutions where there are
more solutions.


TABLE 8
Mean Ratings of Preferred and Nonpreferred Solutions in Two Groups

                        Group 3                         Group 5
Problem        Preferred  Nonpreferred    t     Preferred  Nonpreferred    t
Plot Titles       5.47        5.21        .97      6.23         ?          ?
Table Titles      9.26        7.40         ?        ?           ?          ?
Conclusions       7.30        6.62       1.41      8.52        7.39        ?
Sentences         7.19        7.19        .00      8.10        7.56       2.05*
Cartoon           7.87        6.91       2.41*     7.85        7.08       1.77*
Total             7.42        6.68       1.83*     8.02        7.06       2.20*

*p < .05.
**p < .01.


TABLE 9
Frequency of Superior Ratings among Preferred and Nonpreferred Solutions in Two Groups

                      Group 3                   Group 5
Problem        Preferred  Nonpreferred   Preferred  Nonpreferred
Plot Titles        4          24             4          26
Table Titles       4           8             7          10
Conclusions        6          13            13          14
Sentences          5          20             8          29
Cartoon           10          32            10          28
Total             29          97            42         107


It should be noted that Ss of Group 5 
had information about criteria when they 
produced their solutions. The difference 
between Group 5 and Group 3 is due to the 
influence on production of information 
about the criteria. The difference between 
preferred and nonpreferred in Group 5 is 
additional to this difference in information. 

Since it has now been demonstrated that
Ss did select their best solutions with some
degree of accuracy, the comparison with the
standard condition of Group 1 becomes im-
portant. The means of Group 1, shown in
Table 6, are higher for three of the five
problems than those of the preferred solu-
tions of Group 3, shown in Table 8; hence
instructions to write many and select the
best were not helpful. The means of the
preferred solutions of Group 5 are better
than the means of Group 1 for four of the
five problems, but the differences are all
small. It appears that instructions to write
many solutions lower mean quality but that with
criteria information Ss can select one of
their productions which is about equal to
the single ones written under standard con-
ditions. These comparisons are summarized
in Table 10.

The frequency of superior solutions
among the preferred solutions for all
problems was 29 for Group 3 and 42 for
Group 5. The standard condition of Group
1 yielded 47. These frequencies are compa-
rable since Ss in Group 1 wrote one solution
each and Ss in Groups 3 and 5 chose one
preferred solution each. By this measure
TABLE 10
Mean Rating and Number of Superior Solutions to All Problems by Five Groups

Group              Mean quality   Number of superior solutions
1                      7.84                  47
2                      6.56                 100
3
  Preferred            7.42                  29
  Nonpreferred         6.68                  97
4                       ?                   131
5
  Preferred            8.02                  42
  Nonpreferred         7.06                 107

the standard condition is at least as good 
as any of the variations. 

Thus one measure, mean quality of solu- 
tions identified as best by Ss, gives a slight 
advantage to Group 5, but the other meas- 
ure, number of superior solutions, gives a 
slight advantage to Group 1. Hence no
stable advantage can be claimed for either
condition. In general this analysis of ac-
curacy of judgment leads to the conclusion
that the judgment of Ss was good but not 
good enough. Although Ss of Group 5 had a 
wide range of solutions to choose from and 
information about the criteria that the ex- 
pert raters used, their preferred solutions 
were not consistently better than the single 
solutions of Group 1. 


Order of Production 


To determine whether the superior solu-
tions appeared early or late in the course of
production, the sequences produced by the
four multiple-solution groups, which var-
ied in length from 2 to 22 solutions, were
divided into first and last halves. A few
superior solutions produced in the middle
position of sequences of odd length were
discarded. The results of these counts are
shown in Table 11. For Plot Titles and
Sentences there was a preponderance of
superior solutions in the last half, but this
was significant only for Sentences (p <
.05). The difference in totals across all five
problems was not significant.


TABLE 11
Number of Superior Solutions in First and Last Portions of Production Sequences

Problem         First   Last
Plot Titles       40     55
Table Titles      25     27
Conclusions       44     47
Sentences         43     64
Cartoon           60     59
Total            212    252
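
The halving procedure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the example sequences, and the cutoff of 9 for a superior rating are all hypothetical and are not taken from the article.

```python
def count_superior_by_half(sequence, cutoff):
    """Count superior ratings in the first and last halves of one production sequence.

    `sequence` lists the ratings of one S's solutions in order of production.
    For a sequence of odd length the middle solution is left out of both halves.
    """
    half = len(sequence) // 2
    first = sequence[:half]
    last = sequence[len(sequence) - half:]
    return (sum(r >= cutoff for r in first), sum(r >= cutoff for r in last))

# Hypothetical sequences of ratings, one list per S, in order of production.
sequences = [
    [9, 3, 10, 4, 9],      # odd length: the middle rating (10) is discarded
    [2, 9],
    [5, 6, 4, 9, 10, 7],
]

first_total = last_total = 0
for seq in sequences:
    first, last = count_superior_by_half(seq, cutoff=9)
    first_total += first
    last_total += last

print(first_total, last_total)
```

Summing the two counts over all sequences for one problem gives a pair of frequencies of the kind tabulated for each problem in Table 11.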

These totals were then analyzed by 
experimental groups. For Group 2 there 
were significantly more superior solutions 
in the last half (p < .01). The remaining 
totals were in that direction though not 
significant. Group 2, compared to the other 
multiple-solution groups, most closely con- 
forms to the instructional conditions used 
by Gerlach, et al. (1964) and Parnes 
(1961) where a production-order effect was 
found with less complex problems. Simi- 
larly, Gerlach et al. did not find this ef- 
fect with criteria-cued or evaluation in- 
structions. 


Individual Differences and Correlations 


Since each S wrote solutions to five prob- 
lems, there were 10 interproblem correla- 
tions for the quality of the single solutions 
and the number of superior solutions in 
Group 1, and in each of the four multiple- 
solution groups there were 10 interproblem 
correlations for number of solutions, mean 
quality, and number of superior solutions. 
Scores were also available for the CQT, 
taken at entrance. This test has three parts 
—verbal, general information, and numer- 
ical—but inspection of the correlations for 
the part scores did not reveal any relations 
not shown by the total score, hence only 
the correlations for the total CQT are 
mentioned here. Also available were scores 
on the Michigan State University (MSU) 
Reading Test, designed to measure ability 
of college freshmen to read textual ma- 
terials in several academic areas. Since 
there were 40 Ss in each group, correlations 
above .31 are significant at the .05 level. 
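
The .31 threshold can be checked from the usual relation between a correlation and its t statistic, r = t / sqrt(t^2 + df) with df = n - 2. The sketch below assumes a two-tailed test and takes the critical t for 38 degrees of freedom from standard tables; the function name is hypothetical.

```python
import math

def critical_r(n, t_crit):
    """Smallest correlation that reaches significance for a sample of size n."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Two-tailed .05 critical value of t for 38 df, taken from standard tables (assumed here).
T_CRIT_DF38 = 2.024

r_min = critical_r(40, T_CRIT_DF38)
print(f"{r_min:.3f}")
```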

In Group 1 the correlations for the qual- 
ity rating of the solution between prob- 
lems, and between problems and freshman 
tests were all positive but low, with a
median of .16. For number of superior solutions
the parallel correlations were also low, and 
some were negative, but when each S's total 
number of superior solutions on all five 
problems was treated as a score, this total 
correlated .33 with the score on the MSU 
Reading Test and .45 with CQT. These 
correlations from the single-solution group 
are the easiest to interpret because each S 
probably did his best on his single oppor- 
tunity. They indicate that the abilities re- 
quired for these five problems in the stand- 
ard condition have something in common 
with the abilities required for conventional 
tests of college aptitude. 

In each of the multiple-solution groups, 
Groups 2-5, there were 10 correlations be- 
tween problems in number of solutions pro- 
duced. The 40 correlations for all four 
groups ranged from 0 to .70, but most were 
in the .40s and .50s, and the median was .45. 
This is evidence of the well-known factor 
of verbal productivity or meaningful 
fluency. The correlations with Reading and 
CQT were negligible. 

The 40 corresponding correlations for
mean quality rating ranged from -.33 to
.40, with a median of .11, and the correla-
tions with aptitude test scores were negli-
gible. Instructions to write many solutions
probably have different effects on the qual-
ity standards of different Ss.

The 40 correlations for number of su-
perior solutions ranged from -.26 to .53,
with a median of .06. When each S's total
number of superior solutions on all five
problems was treated as a score, the eight
correlations between this total and aptitude
test scores were all positive but low.

The most interesting correlations in the
data for Groups 2-5 are those bearing on
the old question of the relation between
quantity and quality. Do Ss who write the
most solutions write the best? There were
five correlations between number of solu-
tions produced and mean quality of these
solutions, one for each problem in each
group, 20 for the four groups. With two
exceptions these correlations were negative,
ranging from -.53 to .13, with a median
of -.27. Each S’s total number of solutions
on the five problems was also correlated
with the mean quality of all these solutions,
yielding correlations for Groups 2-5, re-
spectively, of -.13, -.382, -.47, and -.18.
From these two ways of working up the
data on 160 Ss, it appears that those who
wrote many solutions tended to write solu-
tions of inferior quality. Thus these data
on individual differences in quality agree
with the data on differences between con-
ditions. Groups that wrote many solutions
wrote solutions of lower quality than
groups that wrote only one, and, within the
multiple-solution groups, those who wrote
more solutions than average wrote solu-
tions of less-than-average quality.
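
The quantity-quality correlations discussed above are presumably ordinary product-moment correlations computed over the Ss of a group; that is an assumption, since the article does not name the coefficient. The sketch below uses hypothetical data for five Ss only to show the computation.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores: number of solutions written and mean rating of those solutions.
n_solutions = [3, 5, 6, 8, 10]
mean_quality = [8.0, 7.5, 7.0, 7.2, 6.0]

r = pearson_r(n_solutions, mean_quality)
print(round(r, 2))
```

A negative coefficient, as in this invented example, corresponds to the pattern reported here: Ss who write more solutions tend to have a lower mean rating.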

Similar correlations were computed be-
tween number of solutions produced by
each S and number of superior solutions.
These 20 correlations ranged from -.28
to .40, with a median of .09. They were
more often positive than negative, but they
were far smaller than the correlations re-
ported elsewhere for simpler problems. For
four groups writing unusual uses, Parnes
and Meadow (1959) reported correlations
between total number of solutions and num-
ber of superior solutions ranging from .64
to .81. Gerlach et al. (1964) confirmed this
with a correlation of .78 for a similar task.
In the present experiment, with more sub-
stantial problems, the difference between
multiple-solution groups, writing about six
solutions, and the single-solution group was
large enough that the former groups pro-
duced more superior solutions. But within
a multiple-solution group the variation in
productivity was relatively smaller, and,
furthermore, the advantage of large quan-
tity was offset by the reduction in quality
indicated by the preponderantly negative
correlations mentioned above.


Discussion 


The Nature of the Problems 


Some of the differences between the re-
sults of this experiment and others on pro-
ductive thinking are due to the nature of
the problems. Tasks like writing unusual
uses for a brick or uncommon associations
are open to the criticism that they require
only a superficial fluency. To avoid this
criticism the present experiment employed
substantial problems that required S to
integrate the material presented before
constructing appropriate solutions. Varia-
tions in quality of solutions are due to com-
prehension and organization of the material
as well as the production process per se.


Although the five problems were similar
in form and yielded similar results in gen-
eral, some differences can be noted. Many
solutions could be written for all problems,
but Table Titles required succinct compre-
hensive titles, and the number of these was
small. Likewise Conclusions required in-
ferences that correctly integrated the data
of the chart, and the number of these was
small. Comments of Ss and review of the
solutions to these two problems indicated
that multiple-solution instructions led
many Ss to break up the problems and
write titles and conclusions to portions of
them. Such partial solutions received low
ratings because comprehensiveness and
integration were criteria for good solutions.
Table 4 shows that 62 table titles and 23
conclusions were independently given max-
imal ratings by the two judges, however.
Hence this analysis does not mean that
these two problems were especially difficult
but rather that each had only a small num-
ber of superior solutions which could not
be much increased by emphasis on quantity.
Although for the problems as a whole, the
groups that wrote many solutions wrote
more superior solutions, the difference
comes largely from Plot Titles, Sentences,
and Cartoon.

None of the problems had only one right 
answer but on the continuum from diver- 
gent to convergent thinking Table Titles 
and Conclusions would be located more 
toward the convergent end than the other 
three problems. More important, the above 
analysis offers a preliminary differentia- 
tion of solution processes at different loca- 
tions on this continuum. 


Quantity and Quality


The increase in number of superior solu-
tions that comes from an increase in quan-
tity of solutions is in agreement with the
results of research on brainstorming by
Parnes and others and extends the results
to more substantial problems. The explana- 
tion need not be the same, however. Since 
the instructions for Group 2 said nothing 
about ignoring quality or postponing judg- 
ment, the pure quantity effect alone is 
sufficient to account for the difference in 
number of superior solutions, a ratio of 
more than two to one, over Group 1. Ap- 
proximately the same ratio is reported 
when brainstorming instructions are com- 
pared with standard instructions (Meadow, 
Parnes, & Reese, 1959) and when students 
who have had a course in creative problem 
solving are compared with a control group 
(Parnes & Meadow, 1960). By “pure quan- 
tity effect” we mean an effect that operates 
over all degrees of quality, so that when 
more solutions are produced, more superior 
solutions are produced, and that this effect 
is produced by instructions that simply re- 
quest many solutions. It should be noted, 
however, that the research on creative
problem solving by Parnes and others com- 
pared multiple-solution groups under dif- 
ferent instructions while the present com- 
parison is between multiple-solution groups 
and single-solution groups. 

Another effect of the emphasis on quan- 
tity, not emphasized in previous research, 
is the reduction in mean quality. The dif-
ference between the fifty-ninth and seventy-
seventh percentiles is not trivial. The only
other study using this measure that has 
been found (Weisskopf-Joelson & Eliseo, 
1961) reported a drop in mean quality 
following instructions not to be critical. In 
the present study the drop in quality oc- 
curred on all five problems; it is safe 
to assume that it is a general phenomenon. 


Variability and Production 

The intraindividual variability in qual- 
ity of solutions has not been previously 
reported. A quality dimension on which a 
sample of products from one S can be lo- 
cated with adequate reliability is not often 
available. It is possible that intraindividual 
variability in quality of solutions is larger 
for these complex problems than for the 
simpler tasks of other researchers, but no 
comparative data are available. 
Whatever may be the mechanism for the
production of solutions, it is no doubt a 
complicated one, so that the output varies 
on the quality dimension—and probably 
on other dimensions, such as length, clarity
of handwriting, etc., not considered in these
experiments. The semiinterquartile range
of the ratings of the 200 single solutions
produced by Group 1, computed from the
distributions shown in Table 4, is about 3.
Since the median of the ranges of the in- 
dividual Ss of the multiple-solution groups 
is about 7, the semiinterquartile range for 
individuals can be estimated as about 2. 
This last estimate is very rough, but it sug- 
gests that the variability within individuals 
does not differ greatly from the variability 
between individuals. To account for this 
variability there is no necessity at the pres- 
ent time, therefore, to assume any other 
production process for the individual draw- 
ing repeatedly upon his own resources than 
we assume for different individuals draw-
ing once upon different resources.
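
The text does not say how a range of about 7 was converted into a semiinterquartile range of about 2. One conventional route, offered here only as an assumption, treats each S's ratings as roughly normal and each S as writing about six solutions: the expected range of six normal observations is about 2.534 standard deviations, and the semiinterquartile range of a normal distribution is about 0.6745 standard deviations. The function name is hypothetical.

```python
# Expected range of 6 independent normal observations, in standard deviation units
# (the d2 constant for n = 6 from standard quality-control tables).
D2_N6 = 2.534

# Semiinterquartile range of a normal distribution, in standard deviation units.
SIQR_PER_SD = 0.6745

def siqr_from_range(observed_range, d2=D2_N6):
    """Rough semiinterquartile range implied by a typical within-S range of ratings."""
    estimated_sd = observed_range / d2
    return SIQR_PER_SD * estimated_sd

print(round(siqr_from_range(7), 2))
```

Under these assumptions a median range of 7 implies a semiinterquartile range a little under 2, which agrees with the rough figure given in the text.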


The Contribution of Judgment 


At this point the usefulness of the two 
measures, mean quality of all solutions and 
number of superior solutions, should be 
examined. It is hard to imagine any situa- 
tion in which production of solutions com- 
pletes the thinker’s assignment. Usually 
the assignment is to contribute one solu- 
tion. If the thinker produces only one, 
the quality of that one is of course the only 
measure to be considered. If he produces 
many, he usually eliminates all but one and 
submits that one. If he chooses at random 
from his total production, which he prob- 
ably can do, the appropriate measure for 
research is the mean quality of all solutions. 
It has been seen in the present experiment 
that the highest mean was obtained by 
those who wrote only one solution. Hence,
if random choice is to be the final opera- 
tion, the best instructions would be to write 
one solution. 

The other measure, number of superior 
solutions, is the one that has most often 
been used in the research situation, but 
it is hard to imagine any other situation in 
which expert judges would examine all of 
anyone’s productions and separate the 


superior ones from the others. This is a 
measure of theoretical interest because it 
pertains to an intermediate operation, 
but the important overall question is 
whether the thinker can produce a superior 
solution and identify it as such in order to 
complete his assignment. The present study
shows that college students can identify
their best solutions with some accuracy.
But the accuracy is not high. Even when
they produce superior solutions and try to 
select their best, their choice is no better, 
in general, than the one solution produced 
by those instructed to write only one. 

Since production is not improved any 
more by the procedures of other experi- 
ments than by the simple quantity instruc- 
tions of the present experiment, improve- 
ment of the final evaluation of the solutions 
would appear to be the most promising step 
toward improvement of the overall per- 
formance. If quantity instructions increase 
the number of superior solutions and S 
can be trained to identify his best with 
high accuracy, then perhaps the one solu- 
tion selected will be superior to the one 
produced by those who produce only one. 

The assumption that S produces a variety 
of responses one of which is selected by 
reinforcement contingencies in the environ- 
ment is a familiar one, under such names 
as trial and error, but the selection of one 
response by S as a separate operation has 
been generally overlooked. 


Experiment II: Evaluation of
Judgment Training


There are two questions about improve- 
ment of productive thinking by training in 
judgment. (a) Under multiple-solution in- 
structions, will judgment training increase 
the difference between preferred and non- 
preferred solutions relative to the difference 
obtained without training? (b) Will the
single preferred solutions, selected by S
after judgment training, be superior to the
single solutions produced under standard
single-solution conditions?

The literature offers little guidance on
the development of training programs for
judgment of solutions to problems. The
one relevant experiment (Johnson & Zer-
bolio, 1964) gave brief practice in judging
plot titles, but only a small, nonsignificant
improvement was obtained. Hence three
variations of materials and procedures were
tried out in a pilot study with 111 Ss judg-
ing solutions to the sentences problem and
104 Ss judging solutions to the conclusions
problem. Observation of the behavior of
Ss, as well as the objective results, led to
preparation of materials in booklets that
seemed to combine the advantages of all
three variations.

Sentences and Conclusions were chosen
because the results of Experiment I indi-
cated that these two were the most dissim-
ilar. At this point there is more to be
learned about possibilities of improvement
from these than from two that are more
similar.


Method 


Training Materials 


The solutions were taken from the results of 
previous experiments; only those on which the 
two expert judges agreed exactly were used. Rat- 
ing guides, which the expert judges had written 
for their own use, were given to Ss. Practice A
displayed four poor, four good, and four superior 
solutions, with an explanation for the rating of 
each and appropriate reference to the rating guide. 
The Ss were told to study these and, when 
they understood the characteristics of superior 
solutions, to try Practice B. 

Practice B presented 14 triads of solutions to 
the Sentences problem (or 13 to Conclusions), with 
instructions to select the best one of each. Offi- 
cial judgments of the first two triads, with ex- 
planations, were given on the next page, and offi- 
cial judgments of the remainder were also given 
to permit each S to calculate a score for his
accuracy of judgment. The first few judgments
were made rather easy by grouping one very good
solution with two poor ones.

Practice C stressed the contrast of superior 
and inferior solutions. Seven of each were printed, 
with official judgments and explanations for them. 
Then appeared a block of seven superior solu-
tions with blank lines for S to write the character-
istics of superior solutions. This was followed by
a block of seven inferior solutions with lines for 
the characteristics of inferior solutions. This for- 
mat was repeated with six more superior and in- 
ferior solutions. 


Procedure 

The sequence of events for the training groups 
was: writing solutions to a problem, practice in 
judging, return to the solutions first written to 
select the best. The instructions for the last phase
were as follows: 


Up to now you have been practicing the eval- 
uation of sentences (or conclusions) which were 
written by other students. Now the real test of 
your judgment ability comes. Go back to the 
sentences which you wrote on the first page. 
Read each sentence carefully and select the 
three (3) best according to the criteria of supe- 
rior sentences which you have been using. Put 
an “X” by each of these three. 

Then, reread these three sentences and put 
a line completely around the one best. 


Three training groups had the same materials 
but in one condition, called individual practice, 
each S worked alone, as under classroom or exam- 
ination conditions. 

The model for dyadic practice was the common
procedure in research teams when two investiga- 
tors practice together to work up good inter- 
judge agreement in rating statements of opinion, 
taped interviews, and solutions to problems. The 
advantages of dyadic practice seem to be moti- 
vational, in that the two judges stimulate each 
other, and also intellectual, since in discussing 
disagreements they are likely to read the solutions 
and the rating guide more carefully and thus to 
learn the criteria of judgment more specifically. 
The pairs were assembled in small sections of 
8-12 so that E had only 4-6 pairs to supervise.
The Ss entered a room and took seats at tables 
for two, hence the pairing was presumably ran- 
dom, although it is possible that friends might 
take adjacent seats and thus find themselves 
working as partners. Those dyads that were slow 
in initiating conversation were encouraged by 
questions and procedural, but not substantive, 
hints from E.

The third training condition was tutorial prac- 
tice, one S working with one E. Because of the 
time required this condition was limited to one 
problem, Conclusions. The E gave direct help in
clarifying criteria, explaining disagreements, and
interpreting the rating guide.

Each of the training procedures required 40- 
60 minutes. Groups that wrote many solutions 
and selected their best were allowed 7 minutes 
per problem. Those who wrote only one solution 
required only a few minutes per problem. 


Subjects 


The Ss were students in elementary psychol- 
ogy classes at Michigan State University, 20 in 
each condition.

Group 1 was a control group, instructed to 
write one solution. Since little time was required, 
these Ss were given both Sentences and Conclu-
sions, half in each order.

Group 2 was instructed to write one solution, 
but was given the criteria for good solutions, as in 
Experiment I. Since the criteria for good solu-
tions were given to training groups, a single-solu-
tion group given the criteria was necessary for
comparison. These Ss were given both problems, 
half in each order. 
Group 3 wrote many solutions and then selec-
ted their best. The criteria for good solutions were
given. Since relatively little time was required, 
these Ss were given both problems, half in each 
order. 

Group 4 had individual practice, with one 
problem. 

Group 5 had dyadic practice, with one problem. 

Group 6 had tutorial practice, with Conclusions
only.

When Ss were assembled for dyadic practice, 
one or two were withdrawn to another room for 
tutorial practice, leaving an even number.

Since sex differences were not considered in 
Experiment I but are occasionally reported in 
problem-solving research, these groups were 
roughly balanced for sex. 


Results 


The solutions were processed and rated 
as described for Experiment I, except as 
noted below. Interjudge agreement was ade-
quate. Order effects were possible in Groups
1, 2, and 3 but, in respect to mean quality of
solutions, these were small and inconsistent.
Sex differences were also small and incon-
sistent in all groups. Hence order and sex
have been ignored in subsequent analyses.


Sentences 


Mean ratings of the solutions produced 
by the two single-solution groups were 
about the same; the criteria cues were not 
effective (see Table 12). As in Experiment I 
the multiple-solution groups wrote varying 
numbers of solutions, averaging about six 
or seven per S, and the mean quality of 
these solutions was lower than for single- 
solution groups. 

The first question about the effects of
judgment training is whether the difference
between the preferred and the nonpreferred
solutions is larger for groups with training
than for Group 3,

TABLE 12
Mean Quality of Sentences Produced and Judged under Five Conditions

Condition   Total   Preferred   Nonpreferred   Diff.   t


One solution 9.15 


One aulnaae criteria age 
Many solutions . 8.7 5. . 
ny olution, indi: ye 5 8.29 46] .85 
vie i i E Ss 
Many solutions, dyadic pe ial Reta RG 
tri 7 8.75 | 10.45 | 8.31 | 2.14] 4.98¢ 


**p < .01.


without training. Table 12 shows larger
differences in both training groups. To test
for significance a difference score was com-
puted for each S, as in Experiment I, and an
analysis of variance showed a significant
variance between groups (F = 3.25; p <
.05). Comparing the two training groups
with Group 3, as a control group, by Dun-
nett’s t statistic, Group 5 (with dyadic
practice) was superior to Group 3 (p <
.025). In general it appears that the judg-
ment practice, especially practice in pairs,
was effective in helping Ss select their best
solutions.

The other question about the effective- 
ness of judgment training involves a com- 
parison with the standard condition in 
which S simply writes one solution. Since
each S in Groups 1 and 2 wrote only one 
solution and each S in Groups 3, 4, and 5 
selected one preferred solution, there were 
20 solutions in each of the five groups for 
this comparison. Analysis of variance dis- 
closed a difference between groups at the 
.05 level. Groups 1 and 2 may be combined 
to represent the standard condition or con- 
trol group, and in comparison with this 
group of 40 solutions the solutions of Group 
5 were superior by the Duncan multiple 
range test adapted for unequal n’s (p <
.05). Neither Group 3 nor Group 4 was
significantly different from the standard 
condition by this measure. 

Some of these results are displayed graph- 
ically in Figure 2. 

Analysis of the frequency of superior 
solutions shows the same relations but the 
numbers are small. To approximate the 
top 10% of the 434 solutions the nearest 
cutting score was 11, and there were 38 
ratings of 12, 13, and 14. The two single- 
solution groups produced 3 each, while 
Groups 3, 4, and 5 produced 11, 7, and 14, 
respectively. As to the effects of training, 
there were only 2 superior solutions among 
the preferred solutions of Group 3, but 
Group 4 chose 3 and Group 5 chose 7. Thus
only Group 5, with dyadic practice, was
better than the control groups.


[Figure 2: mean ratings by groups; legend: All Solutions, Preferred, Nonpreferred.]

Fig. 2. Effects of judgment training on solu-
tions to sentences problem. (The Ss in Groups 1
and 2 wrote one solution each, but Group 2 had
the criteria for good solutions. Groups 3, 4, and 5
wrote many solutions, but Group 4 had individual-
judgment training and Group 5 had dyadic-judg-
ment training.)


As a measure of the effects of training,
the frequency of superior solutions is some-
what misleading because some Ss wrote
more than one but could not select more
than one as the best. If only the Ss who
produced at least one superior solution
are considered, the percentages of these Ss
whose preferred solution was superior were
60 for Group 4, and 70 for Group 5, as
contrasted with only 33 for the untrained
Group 3. These data are too meagre to
stand by themselves but they agree, by
another method of analysis, with the data
on mean quality.
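
The choice of a cutting score that approximates the top 10% of ratings, used for the frequency counts in this section, can be sketched as follows. The distribution of ratings below is hypothetical and the function name is an assumption; only the final line uses figures quoted in the text (38 superior ratings among 434 solutions). The function returns the lowest rating counted as superior, so a returned value of 12 corresponds to a cutting score of 11 in the article's wording.

```python
from collections import Counter

def nearest_threshold(ratings, target=0.10):
    """Return (threshold, share): ratings >= threshold form the share closest to target."""
    counts = Counter(ratings)
    total = len(ratings)
    best = None
    at_or_above = 0
    for score in sorted(counts, reverse=True):
        at_or_above += counts[score]       # ratings at or above this score
        share = at_or_above / total
        if best is None or abs(share - target) < abs(best[1] - target):
            best = (score, share)
    return best

# Hypothetical distribution of ratings (rating -> number of solutions).
hypothetical = {14: 1, 13: 2, 12: 3, 11: 6, 10: 8, 9: 10, 8: 10, 7: 10, 6: 10}
ratings = [score for score, n in hypothetical.items() for _ in range(n)]

threshold, share = nearest_threshold(ratings)
print(threshold, round(share, 3))

# Share implied by the counts quoted in the text.
print(round(38 / 434, 3))
```

Because ratings are whole numbers, the achievable share moves in steps, which is why the quoted counts fall somewhat short of an exact 10%.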


Conclusions 


Mean quality of the solutions written by 
Group 2, having the criteria for good solu- 
tions, was slightly higher than that for 
Group 1, but the difference was not signifi-
cant (see Table 13). The multiple-solution


TABLE 13
Mean Quality of Conclusions Produced and Judged under Six Conditions

Condition   Total   Preferred   Nonpreferred   Diff.   t


One solution 


Mens polnieom tie yor 6.49 | 2.16 | 2.40* 
apy solutions 0a 5.6 5.32 | 2.38 | 2.60% 
ae 6.38 5.81 | 2.49 | 2.78°* 
rer ron eater a 4.98 | 1.92 | 1.70 


*p < .05.
**p < .01.


groups wrote 4-6 solutions per S, of gen-
erally lower quality, as in Experiment I.
Figure 3 displays these data graphically. 

In three multiple-solution groups the pre- 
ferred solutions were significantly superior 
to the nonpreferred solutions. Evidently 
judgment was easy with the criteria of a 
good solution, and judgment training did 
not add much to accuracy of judgment. 
Hence the differences between preferred 
and nonpreferred solutions were about 
the same in all four groups. 
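
The Diff. and t columns of the tables compare preferred with nonpreferred solutions within a group. The source does not state which t test was used; the sketch below assumes a paired (correlated) test on one preferred rating and one mean nonpreferred rating per S. The function name and the ratings are hypothetical.

```python
import math


def paired_t(preferred, nonpreferred):
    """Return (mean difference, t) for paired ratings, one pair per S.

    Assumes a paired t test with len(preferred) - 1 degrees of freedom;
    this is an assumption, not a procedure stated in the source.
    """
    diffs = [p - n for p, n in zip(preferred, nonpreferred)]
    k = len(diffs)
    mean_d = sum(diffs) / k
    # Unbiased variance of the differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (k - 1)
    return mean_d, mean_d / math.sqrt(var_d / k)


# Hypothetical ratings for four Ss (not data from the experiment).
mean_d, t = paired_t([8, 7, 9, 6], [6, 6, 7, 5])
print(mean_d, round(t, 3))
```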

The quality of the preferred solutions is the joint effect of the
quality of the solutions produced and accuracy in selecting the
best. Table 13 shows considerable variation in quality of solutions
produced by the four multiple-solution groups. Group 3, which
produced solutions of relatively high quality, had preferred
solutions of high quality, although not as high as the single
solutions of Group 2. Since Group 6 produced solutions of low
quality, this experiment may not provide a fair test of the value
of tutorial training.

[Figure 3: mean ratings by group for all solutions, preferred
solutions, and nonpreferred solutions.]

Fig. 3. Effects of judgment training on solutions to conclusions
problem. (The Ss in Groups 1 and 2 wrote one solution each, but
Group 2 had the criteria for good solutions. Groups 3-6 wrote many
solutions, but Group 4 had individual-judgment training, Group 5
had dyadic training, and Group 6 had tutorial training.)

Analysis of the number of superior solu- 
tions disclosed the same relations by an- 
other measure. 

The two problems chosen for Experiment 
II were the most diverse of the five problems 
of Experiment I. Conclusions was the only 
problem on which multiple-solution groups 
did not produce distinctly more superior 
solutions than single-solution groups (see
Table 7). Conclusions may be a problem
in which accuracy of judgment of solutions
is so good that there is little room
for improvement. Sentences, on the other 
hand, were judged poorly without special 
training. Obviously, what is needed is a 
repetition of Experiment II with other 
problems. 


Experiment III: Evaluation of
Judgment Training


The plot-title problem and the cartoon problem were chosen for
further evaluation of judgment training, and the same procedures
were followed, with a few exceptions. In Experiment II no clearcut
differences were found between single-solution groups with and
without criteria cues, hence all groups in the present experiment
were given the criteria for good solutions. The groups were
composed of 30 Ss drawn from the same population as before.
Training materials were constructed for Plot Titles and Cartoon
following the materials constructed for Sentences and Conclusions.

The tutorial group of Experiment II did not do as well as expected.
One possibility was that since these Ss were withdrawn from one
room and taken to another, the special conditions may have lowered
their productivity prior to judgment training. Hence, in the
present experiment this group wrote their solutions in the same
room with the other Ss and were withdrawn later for judgment
training.


Results 
Plot Titles 


The results, analyzed as before, are shown in Table 14 and
Figure 4. The individual training and the tutorial training
yielded significant differences but the difference due to dyadic
training was surprisingly small. Analysis of variance of the
preferred solutions from the multiple-solution groups together
with the single solutions from Group 1 disclosed a significant
between-groups difference (p < .01). As compared with Group 1, the
preferred solutions of Group 3 were better (p < .05) and so were
those of Group 5 (p < .01).


TABLE 14

MEAN QUALITY OF PLOT TITLES PRODUCED AND
JUDGED UNDER FIVE CONDITIONS

Condition                            | Total       | Preferred  | Nonpreferred | Diff.       | t
One solution                         | 5.97        |            |              |             |
Many solutions                       | 5.78        | 5.60       | 5.86         | -.26        | .89
Many solutions, individual training  | 5.77        | 77.0 [sic] | 5.33         | [illegible] | 4.32*
Many solutions, dyadic training      | [illegible] | 5.40       | [illegible]  | [illegible] | [illegible]
Many solutions, tutorial training    | 6.87        | 8.18       | 5.88         | [illegible] | 4.67*

* p < .01.


[Figure 4: mean ratings by group for all solutions, preferred
solutions, and nonpreferred solutions.]

Fig. 4. Effects of judgment training on solutions to plot-title
problem. (The Ss in Group 1 wrote one solution each, with the aid
of criteria cues. Groups 2-5 wrote many solutions, but Group 3 had
individual-judgment training, Group 4 had dyadic training, and
Group 5 had tutorial training.)

Table 14 shows that Group 5 wrote solu- 
tions of a higher mean quality than Group 
1. This small difference, which is not sig- 
nificant, is the only case in which instruc- 
tions to write many solutions did not cause 
a drop in mean quality. In Experiment I 
it was Plot Titles that showed the smallest 
drop in mean quality as a consequence of 
multiple-solution instructions. 


Cartoon 


The results for the cartoon problem are shown in Table 15 and
Figure 5. All three types of judgment training resulted in
significant differences between preferred and nonpreferred
solutions, while the corresponding difference for Group 2, without
training, was negligible. The preferred solutions of the groups
with judgment training were approximately of the same mean quality
as the single solutions of Group 1.

TABLE 15

MEAN QUALITY OF CARTOON COMPLETIONS
PRODUCED AND JUDGED UNDER FIVE CONDITIONS

Condition                            | Total | Preferred | Nonpreferred | Diff. | t
One solution                         | 8.14  |           |              |       |
Many solutions                       | 6.64  | 6.46      | 6.74         | -.28  | 1.12
Many solutions, individual training  | 6.35  | 8.14      | 6.17         | 1.97  | 3.60*
Many solutions, dyadic training      | 6.56  | 8.74      | 5.86         | 2.88  | 3.60*
Many solutions, tutorial training    | 6.73  | 8.04      | 6.61         | 1.43  | 3.59*

* p < .01.


Experiment IV: Judgment
Without Production


In Experiments II and III judgment 
of the solutions was complicated by pro- 
duction of the solutions. This condition is 
representative of common production-and- 
judgment situations and is a necessary com- 
plication for some experiments, but cer- 
tain comparisons can be made more clearly 
by isolating the judgment process. One 
question concerns the effects of criteria 
cues on judgment. Since criteria cues im- 
proved production in the multiple-solution 
conditions of Experiment I, it is possible 
that information about criteria was re- 
sponsible for a major part of the improve- 
ment attributed to judgment training in 
Experiments II and III. This hypothesis is most plausible for
Conclusions because of the small number of good solutions. The
alternative hypothesis is that judgment training is necessary.
Perhaps information about criterial and irrelevant dimensions can
be effectively utilized only after specific training, with
appropriate feedback.

[Figure 5: mean ratings by group for all solutions, preferred
solutions, and nonpreferred solutions.]

Fig. 5. Effects of judgment training on solutions to cartoon
problem. (The Ss in Group 1 wrote one solution each, with the aid
of criteria cues. Groups 2-5 wrote many solutions, but Group 3 had
individual-judgment training, Group 4 had dyadic training, and
Group 5 had tutorial training.)

The pilot study, mentioned as a pre- 
liminary to Experiment II, included a mul- 
tiple-choice judgment test. Although this 
was intended more for training than for 
evaluation, the data suggested possible use 
as a dependent variable. Therefore new 
multiple-choice judgment tests were con- 
structed for Conclusions, Sentences, and 
Cartoon, incorporating solutions produced 
by Ss of previous experiments and rated by 
the expert judges. The tests consisted of 10 
five-alternative items for each problem of 
varying difficulty, as estimated from the 
quality ratings of the solutions. 


Method 


The standard condition required S to read a 
problem and then take the judgment test. Each of 
the 42 Ss had all three problems, in counterbal- 
anced orders. The criteria-cued condition added 
information about the criteria of a good solution, 
taken from previous experiments, before the 
judgment test. Each of the 41 Ss had all three 
problems, in counterbalanced orders. The judg- 
ment training required more time, hence each S 
had only one problem, then the training work- 
book, and then the judgment test. Thus there 
were 126 Ss in this condition, 42 for each prob- 
lem. This training corresponds to the individual 
training of Experiments II and III. In all con- 
ditions, however, the final judgments on the test 
would not be considered ego-involving, since Ss 
were judging solutions written by others. 
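
The counterbalanced orders mentioned above can be illustrated with a short sketch. The source does not say how orders were assigned; this assumes complete counterbalancing, in which the six possible orders of the three problems are rotated across Ss. The function name is hypothetical.

```python
from collections import Counter
from itertools import permutations

PROBLEMS = ("Conclusions", "Sentences", "Cartoon")


def counterbalanced_orders(n_subjects, problems=PROBLEMS):
    """Assign each S one presentation order, cycling through all orders.

    Assumes complete counterbalancing (every permutation is used);
    the source reports only that orders were counterbalanced.
    """
    orders = list(permutations(problems))
    return [orders[i % len(orders)] for i in range(n_subjects)]


# With 42 Ss and 6 possible orders, each order is used equally often.
counts = Counter(counterbalanced_orders(42))
for order, n in counts.items():
    print(n, " -> ".join(order))
```

With 41 Ss, as in the criteria-cued condition, the same rotation leaves one order used one time fewer than the others.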

Booklets were assembled and distributed to small groups of
volunteer Ss who came to the research building to participate in
productive thinking research and receive class credit.


Results 


Keys for the multiple-choice judgment 
test were made up in advance, and the tests 
were scored mechanically. Mean scores are 
shown in Table 16. The standard deviations 
for the nine distributions were quite sim- 
ilar, varying from 1.23 to 1.66. Analysis of 
variance yielded significant F ratios for 
each problem, and for each problem the 
mean after judgment training was different 
from each of the other two means (p < .01), according to the
Duncan multiple range test. In general, these results support the
effectiveness of judgment training even when criteria cues have
been supplied.


TABLE 16

MEAN SCORES ON JUDGMENT TEST UNDER THREE
CONDITIONS OF INSTRUCTION AND TRAINING

Problem      | Standard    | Criteria-cued | Judgment training | F
Conclusions  | [illegible] | [illegible]   | [illegible]       | 10.33*
Sentences    | [illegible] | [illegible]   | [illegible]       | 41.27*
Cartoon      | [illegible] | [illegible]   | [illegible]       | [illegible]

* p < .001.
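
For readers unfamiliar with the F ratios reported for the judgment test, the following is a minimal sketch of a one-way between-groups analysis of variance for a single problem. It is an illustration under stated assumptions, not the authors' computation: the function name and the scores are hypothetical, and the Duncan multiple range test used for the pairwise comparisons is not implemented here.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: one list of test scores per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of condition means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own condition mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three conditions (not data from the experiment).
standard = [4, 5, 6]
criteria_cued = [5, 6, 7]
judgment_training = [7, 8, 9]
print(one_way_f([standard, criteria_cued, judgment_training]))
```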


Conclusions


A few methodological conclusions are in 
order. These experiments demonstrate the 
potentialities of separating production from 
judgment and manipulating each process 
separately. College students can follow sim- 
ple instructions about production and judg- 
ment without difficulty. Multiple-choice 
tests of judgment have certain advantages 
also; judgment without production can be 
studied as well as production without judgment.

Despite the high agreement between the 
two raters in judging Ss’ solutions, the use 
of ratings always raises methodological 
suspicions because two judges can agree on 
many dimensions, not all of which are 
pertinent to the hypotheses under con- 
sideration. It is comforting to note that 
Ss’ preferred solutions got higher ratings 
from the expert judges than the nonpre- 
ferred solutions, and that the difference 
was enhanced by information about the
judges’ criteria and by practice in judgment.

It is obvious, when overall performance 
is considered, that the ordinary single- 
solution condition should be included. Much 
more enthusiastic conclusions about specific 
procedures could be written if this con- 
dition had been left out. The standard pro- 
cedure of asking S to write a solution to a 
problem is a hard one to beat. 

Instructing S to select his best solution 
is a useful research procedure. Having S 
produce many solutions illuminates the
production process, but it is seldom an
end in itself. Someone has to select the
best solutions.

The value of including several problems 
was amply supported. The five problems 
were similar in certain aspects of the re- 
sults, but different in other aspects. 

As compared with the standard condition 
of writing one solution, instructions to 
write many solutions yielded large effects. 
Mean quality went down and the number 
of superior solutions went up. Such in- 
structions may be augmented by instruc- 
tions to defer judgment or ignore quality, 
but the addition of this feature was not 
studied here. Instructions simply to write 
many solutions resulted in roughly double 
the number of superior solutions, about 
the same increase as reported for more com- 
plicated instructions. These instructions 
may be augmented in the other direction 
also, by including the criteria for good 
solutions. This addition decreased number 
produced and increased both mean quality 
and number of superior solutions. The
reduction in quality with instructions to
increase quantity is consonant with the
negative correlation for individual differences
in quality and quantity.

When instructed to write many solutions, 
each S wrote solutions of a wide range of 
quality. The number of superior solutions 
was about the same in the first and second 
halves of a 7-minute production period for 
most problems across all groups, but in the 
case of Sentences the number in the second 
half was significantly larger. When these 
data were analyzed by groups, the
production-order effect held only for the group
which had no criteria-cued or evaluation
instructions. This concurs with previous
research.

As compared to the single-solution con- 
dition, the increase in number of superior 
solutions in multiple-solution conditions 
was due to the quantity effect and the
general intraindividual variability, not to a
few superior Ss. Each S wrote some good
solutions and, when asked to select his best, 
selected one that was likely to be better 
than the others. Even so, the preferred 
solutions were seldom better than the solu- 
tions of those who wrote only one each. 


Attempts to improve overall performance 
by three types of judgment training—in- 
dividual, dyadic, and tutorial—were gen- 
erally successful in that the differences 
between preferred and nonpreferred solu- 
tions were increased, though this improve- 
ment varied across problems and type of 
training. Dyadic training, for example, was 
particularly successful on Sentences and 
Cartoon but not on Plot Titles. When con- 
ditions were favorable, that is, when in- 
structions to write many solutions did not 
reduce quality severely and when the train- 
ing in judgment was quite successful, the 
preferred solutions of the multiple-solution 
groups were superior to the solutions writ- 
ten by the standard single-solution groups. 
And a control experiment with a multiple- 
choice test of judgment demonstrated that 
the improvement was due to the judgment 
training, not merely to information about 
the criteria for good solutions. 